PREFACE


This book, with apologies for the pretentious title, represents the text of a course 
we have been teaching at Harvard for the past eight years. The course is aimed 
at students with a good grounding in one-variable calculus. Some prior acquaintance with linear algebra is helpful but not 
necessary. Most of the students simultaneously take an intensive course in physics 
and so are able to integrate the material learned here with their physics education. 
This also is helpful but not necessary. The main topics of the course are the theory 
and physical application of linear algebra, and of the calculus of several variables, 
particularly the exterior calculus. Our pedagogical approach follows the ‘spiral 
method’ wherein we cover the same topic several times at increasing levels of 
sophistication and range of application, rather than the ‘rectilinear approach’ of 
strict logical order. There are, we hope, no vicious circles of logical error, but we 
will frequently develop a special case of a subject, and then return to it for a more 
general definition and setting only after a broader perspective can be achieved 
through the introduction of related topics. This makes some demands of patience 
and faith on the part of the student. But we hope that, at the end, the student is 
rewarded by a deeper intuitive understanding of the subject as a whole. 

Here is an outline of the contents of the book in some detail. The goal of the 
first four chapters is to develop a familiarity with the algebra and analysis of 



But we always formulate the results with the higher-dimensional case in mind. We 
begin Chapter 1 by explaining the relation between the multiplication law of 2 × 2 
matrices and the geometry of straight lines in the plane. We develop the algebra 
of 2 × 2 matrices and discuss the determinant and its relation to area and 
orientation. We define the notion of an abstract vector space, in general, and 
explain the concepts of basis and change of basis for one- and two-dimensional 
vector spaces. 

In Chapter 2 we discuss conformal linear geometry in the plane, that is, the 
geometry of lines and angles, and its relation to certain kinds of 2 × 2 matrices. 

quantum mechanics. We use these notions to give an algorithm for computing 
the powers of a matrix. As an application we study the basic properties of Markov 
chains. 

The principal goal of Chapter 3 is to explain that a system of homogeneous 
linear differential equations with constant coefficients can be written as du/dt = Au 
where A is a matrix and u is a vector, and that the solution can be written as 
u(t) = e^{At}u(0), once we have explained 
what is meant by the exponential of a matrix. We also describe the qualitative 
behavior of solutions and the inhomogeneous case, including a discussion of 


Chapter 4 is devoted to the study of scalar products and quadratic forms. It is 
rich in physical applications, including a discussion of normal modes and a detailed 
treatment of special relativity. 

Chapters 5 and 6 present the basic facts of the differential calculus. In Chapter 5 
we define the differential of a map from one vector space to another, and discuss 
its basic properties, in particular the chain rule. We give some physical applications 
such as Kepler motion and the Born approximation. We define the concepts of 
directional and partial derivatives, and linear differential forms. 

In Chapter 6 we continue the study of the differential calculus. We present the 
vector versions of the mean-value theorem, of Taylor’s formula and of the inverse 
function theorem. We discuss critical point behavior and Lagrange multipliers. 

Chapters 7 and 8 are meant as a first introduction to the integral calculus. 





be applied to a physical theory - optics. It is all in the nature of applications, and 

In Chapter 10 we go back and prove the basic facts about finite-dimensional 
vector spaces and their linear transformations. The treatment here is a 
straightforward generalization, in the main, of the results obtained in the first four chapters 
in the two-dimensional case. The one new algorithm is that of row reduction. Two 




those of the dual space and the quotient space. These concepts will prove crucial 
in what follows. 
matrices. The subject is developed axiomatically, and the basic computational 
algorithms are presented. 

Chapters 12-14 are meant as a gentle introduction to the mathematics of shape, 
that is, algebraic topology. In Chapter 12 we begin the study of electrical networks. 
This involves two aspects. One is the study of the ‘wiring’ of the network, that is, 
how the various branches are interconnected. In mathematical language this is 
known as the topology of one-dimensional complexes. The other is the study of 
how the network as a whole responds when we know the behavior of the individual 
branches, in particular, power and energy response. We give some applications to 
physically interesting networks. 

In Chapter 13 we continue the study of electrical networks. We examine the 
problems associated with capacitive networks and use these 

In Chapter 14 we give a sketch of how the one-dimensional results of Chapters 12 
and 13 generalize to higher dimensions. 

Chapters 15-18 develop the exterior differential calculus as a continuous version 
of the discrete theory of complexes. In Chapter 15 the basic facts of the exterior 
calculus are presented: exterior algebra, k-forms, pullback, exterior derivative and 
Stokes’ theorem. 

Chapter 16 is devoted to electrostatics. We suggest that the dielectric properties 
of the vacuum give the continuous analog of the capacitance of a network, and 




Chapter 17 continues the study of the exterior differential calculus. The main 
topics are vector fields and flows, interior products and Lie derivatives. These are 
applied to magnetostatics. 

Chapter 18 concludes the study of the exterior calculus with an in-depth 
discussion of the star operator in a general context. 

Chapter 19 can be thought of as the culmination of the course. It applies the 



But Chapters 1-9, 20 and 21 would form a self-contained unit for a shorter course. 

The material in Chapter 20 is a relatively standard treatment of the theory of 
functions of a complex variable, suitable for students at the level of this book. 
Chapter 21 discusses some of the more elementary aspects of asymptotics. 
Chapter 22 shows how the exterior calculus can be used in classical 
thermodynamics, following the ideas of Born and Carathéodory. 

Most of the mathematics and all of the physics presented in this book were 
developed by the first decade of the twentieth century. The material is thus at 

elementary courses (although most of it, with the possible exception of network 
theory, must be learned for a grasp of modern physics, and is studied at some stage 
of the physicist’s career). The reasons are largely historical. It was apparent to 
Hamilton that the real and complex numbers were insufficient for the deeper study 
of geometrical analysis, that one wants to treat the number pairs or triplets of 
the Cartesian geometry in two and three dimensions as objects in their own right 
with their own algebraic properties. To this end he developed the algebra of 
quaternions, a theory which had a good deal of popularity in England in the 
middle of the nineteenth century. Quaternions had several drawbacks: they more 
naturally pertained to four, rather than to three dimensions - the geometry of 
three dimensions appeared as a piece of a larger theory rather than having a 
natural existence of its own; also, they have too much algebraic structure, the 
relation between quaternion multiplication, for example, and geometric 
constructions in three dimensions being somewhat complicated. (The first of these objections 
would, of course, be regarded far less seriously today. But it would be replaced by 
an objection to a theory that is limited to four dimensions.) Eventually, the 
three-dimensional vector algebra with its scalar and vector products was distilled from 
the theory of quaternions. It was conjoined with the necessary differential 
operations, and gave rise to the vector analysis as finally developed by Gibbs and 
promulgated by him in a famous and very influential text. 

So vector analysis, with its grad, div, curl etc., became the standard language in 
which the geometric laws of physics were taught. Now while vector analysis is 
well suited to the geometry of three-dimensional Euclidean space, it has a number 
of serious drawbacks. First, and least serious, is that the essential unity of the 
subject is obscured. Thus the fundamental theorem of the calculus, Green’s theorem, 
Gauss’ theorem and Stokes’ theorem are all aspects of the same theorem (now 
called Stokes’ theorem). But this is not at all clear in the vector analysis treatment. 
More serious is that the fundamental operators involve the Euclidean structure 
(for example grad and div) or the three-dimensional structure and orientation as 
well (for example curl). Thus the theory is wedded to a three-dimensional orientated 
Euclidean space. A related problem is that the operators do not behave nicely 
under general changes of coordinates - their expression in non-rectangular 
coordinates being unwieldy. Already Poincaré, in his fundamental scientific and 
philosophical writings which led to the theory of relativity, stressed the need to 
distinguish between those laws of geometry and physics which are ‘topological’, 
i.e. depend only on the differential structure of space and so are invariant under 
smooth deformations, and those which depend on more geometrical structure such 
as the notion of distance. One of the major impacts of the theory of relativity on 
mathematics was to encourage the study of higher-dimensional spaces, a study 
which had existed in the previous mathematical literature, but was not regarded 
as central to the study of geometry. Another was to emphasize general coordinate 
changes. The vector analysis was not up to these two tasks and so was supplemented 
in the more advanced literature by tensor analysis. But tensor analysis with its 
jumble of indices has a number of serious drawbacks, the most serious of which 
being that it is extraordinarily difficult to tell which operations have any geometric 
significance and which are artifacts of the coordinate system. Thus, while it is 
reasonably well-suited for computation, it is hard to assess exactly what it is that 
one is computing. The whole purpose of the development initiated by Hamilton - to 
have a calculus whose objects have a perceived geometrical significance - was 
vitiated. In order to make the theory work one had to introduce a relatively 
sophisticated geometrical construct, such as an affine connection. Even with such 
constructs the geometric meanings of the operations are obscure. In fact tensor 
analysis never displaced the intuitively clear vector analysis from 


curriculum. 

It is generally accepted in the mathematics community, and gradually being 
accepted in the physics community, that the most suitable framework for 
geometrical analysis is the exterior differential calculus of Grassmann and Cartan. This 
calculus has the advantage that its computational rules are simple and concise, 
that its objects have a transparent geometrical significance, that it works in all 
dimensions, that it behaves well under maps and changes of coordinates, that it 

between the ‘topological’ and ‘metrical’ properties. The geometrical laws of physics 
take on a simple and elegant form in terms of the exterior calculus. To emphasize 

not appreciated by the mathematical community and was dismissed by the leading 
German mathematicians of his time. In fact, Grassmann was never able to get a 

his career. (Nevertheless, he seemed to have a happy and productive life. He raised a 

and was recognized as an expert on Sanskrit literature.) Towards the 

fared no better than the first. Only one or two mathematicians of his time, such as 
Möbius, appreciated his work. Nevertheless, the Ausdehnungslehre (or calculus of 
extension) contains for the first time many of the notions central to modern 




spaces, exterior algebra, exterior and interior products and a form of the generalized 
Stokes’ theorem all make their appearance. 

of our century. His early work, of such overwhelming importance for modern 
mathematics, on Lie groups and on systems of partial differential equations was 
done in relative obscurity. But, by the 1920s, his work became known to the broad 
mathematical community, due, in part, to the writings of Hermann Weyl who 
presented novel expositions of his work at a time when the theory of Lie groups 
began to play a central role in mathematics and in physics. Cartan’s work on the 
theory of principal bundles and connections is now basic to the theory of elementary 
particles (where it goes under the generic name of ‘gauge theories’). In 1922 Cartan 
published his book Leçons sur les invariants intégraux in which he showed how 




only for geometry but also for the variational calculus and a wide variety of 
appropriate vehicle for the formulation of the geometrical laws of physics. 

curriculum and have proceeded accordingly. 

Some explanation is in order for the time and effort devoted to the theory of 
electrical networks, a subject not usually considered as part of the elementary 
curriculum. First of all there is a purely pedagogical justification. The subject 
always goes over well with the students. It provides a down-to-earth illustration 
of such concepts as dual space and quotient space, concepts which frequently seem 














definition, and a natural one at that. This serves to motivate the d operator and 


philosophical reasons for our decision to emphasize network theory. It has been 
recognized for about a century that the forces that hold macroscopic bodies 




the notion of rigid body and Euclidean geometry makes sense, that is, in the 
non-relativistic realm) the concept of a rigid body, and hence of Euclidean geometry, 
derives from electrostatics. The frontiers of physics, both in the very small (the 
study of elementary particles) and the very large (the study of cosmology) have 




time. We thought it wise to bring some of the issues relating geometry to physics 



hoped that our discussion may be of some use to those who will have to deal with 
this problem in the future. 

Of course, we have had to omit several important topics due to the limitation 
of a one-year course. We do not discuss infinite-dimensional vector spaces, in 
particular Hilbert spaces, nor do we define or study abstract differentiable manifolds 
and their properties. It has been our experience that these topics make too heavy 
a demand on the sophistication of the student, and the effort involved in explaining 
them is best expended elsewhere. Of course, at various places in the text we have 
to pay the price for not having these concepts at our disposal. More serious is the 
omission of a serious discussion of Fourier analysis, classical mechanics and 
probability theory. These topics are touched upon but not presented as a coherent 
subject of study. Our only excuse is that a thorough study of each would probably 
require a semester’s course, and substantive treatments from the modern viewpoint 
are available elsewhere. A suggested guide to further reading is given at the end 
of the book. 

We would like to thank Prof. Daniel Goroff for a careful reading of the 

Chapters 12-14 are meant as a gentle introduction to the 
mathematics of shape, that is, algebraic topology. In Chapter 
12 we begin the study of electrical networks. This involves two 
aspects. One is the study of the ‘wiring’ of the network, how 
the various branches are interconnected. In mathematical 
language this is known as the topology of one-dimensional 
complexes. The other is the study of how the network as a 
whole responds, when we know the behavior of the individual 
branches, in particular as regards power and energy. We give 
some applications to physically interesting networks. 


Introduction 

Electrical circuit theory is an approximation to electromagnetic theory in which 
it is assumed that the interesting phenomena can all be described in terms of 
what is happening along the wires and other parts of an electrical circuit. It further 
assumes that the circuit can be decomposed into various components, each with 
a specified mode of behavior, and the problem is to predict how the system as 
a whole will behave when the components are interconnected in various ways. 

The basic variables of circuit theory are familiar from household appliances; 
they are current, voltage and power. A fundamental unit in electromagnetic theory 
is the charge of the electron. As of this writing (prior to the discovery of quarks) 
no known particle has a charge that is a fractional part of the charge of the 
electron. The practical unit of charge is the coulomb, which represents the 
negative of the charge of $6.24 \times 10^{18}$ electrons. In 1819 Oersted observed that a 
flow of electric charge produced a force on a magnetic needle, and that the force 
was proportional to the rate of flow of charge. The measurement of this effect is 
much easier than the measurements of electrostatic forces that are needed to 
measure charge. For this reason, current, rather than charge, is a basic variable 
of circuit theory. The unit of current is the ampere, where 1 ampere = 1 coulomb/second. 
Current and charge are related by 

$$I = \frac{dQ}{dt},$$

where $I$ is current, measured in amperes, $Q$ is charge, measured in coulombs, 
and $t$ is time, measured in seconds. In general, we would expect that the current 
flowing through a circuit would depend on position. In circuit theory it is assumed 
that the current takes on a definite value at each component (but may be 
time-dependent). If $\alpha$ denotes a branch of the circuit, then we let $I_\alpha(t)$ denote the current 
flowing through $\alpha$ at time $t$. 




of a circuit. The energy change (measured in joules) per unit charge (measured in 
branch $\alpha$ will be denoted by $V^\alpha(t)$. 

The product of current and voltage has the units of energy/time which is known 
as power. The unit of power corresponding to the units that we have introduced 
above is called the watt. Thus 

1 watt = 1 volt × ampere = 1 joule/second. 

In circuit theory, it is assumed that there are three kinds of branches: inductors, 
capacitors and resistors. 

Figure 12.2 (panels: resistor only, real battery, ideal battery) 

called the resistance of the branch. Its graph in the $IV$-plane is a straight line 

any device which can be described by an (inhomogeneous) linear equation in $I$ 
and $V$. For example, a real battery, with internal resistance $R_\alpha$, can be described 
by $V^\alpha = W^\alpha + R_\alpha I_\alpha$, where the constant $W^\alpha$ is called the emf of the battery. An 

is described by $V^\alpha = W^\alpha$, whose graph is a horizontal straight line, while an ideal 
current source, which provides current $K_\alpha$ no matter what the voltage across its 

In analyzing circuits in which the voltages and currents change with time, we 
must consider sources of voltage and current which vary with time. In this case, 

$$V^\alpha(t) - W^\alpha(t) = R_\alpha\,\bigl(I_\alpha(t) - K_\alpha(t)\bigr),$$

where $W^\alpha(t)$ and $K_\alpha(t)$ are specified functions of time. 

More generally, we might consider nonlinear resistors, inductors, or capacitors. 







where the only restriction is that no derivatives of $I$ or $V$ appear. 

This would be the case, for example, if an inductor with an iron core were 
used for large currents. Similarly, we might consider a nonlinear capacitor, described 
by 




which could be the result of using a dielectric with a nonlinear response. 

Little more will be said about nonlinear devices, but it is important to recognize 
that most of the theory which follows, which is concerned primarily with setting 
up rather than solving the equations for electrical networks, applies with equal 
validity to linear and nonlinear elements. 

We now assume that our electrical circuit is built out of $b$ branches of the three 
types just mentioned. The branches are connected at their end points to one 
another in some way. We wish to determine the currents and voltages in all the 
branches, $2b$ unknowns in all. Each branch gives one equation (either a differential 
equation or a functional relation involving the current and voltage through that 
branch). We need $b$ more equations. These are given by what are now known as 
Kirchhoff’s laws. 

Kirchhoff, as a student in Neumann’s seminar, made the first comprehensive 
study of the network problem. He published his results in 1845 and 1847. He 
proved the existence of a solution to the network problem for a passive linear 
purely resistive network; i.e., for one in which there are only passive linear resistors. 
In solving this problem, he was one of the first to study the algebraic properties 
of shape. This abstract study of shape was created, as a mathematical discipline, 

subject is called algebraic topology. The natures of the methods of algebraic topology 
make themselves apparent in the case of passive linear resistive networks, and so 
we shall occupy a considerable amount of space studying these networks before 
returning to the general case. In treating these networks, we shall follow a 1923 
paper by Weyl, in which a proof of Kirchhoff’s results is presented in a fashion 
that explains more directly the relations with algebraic topology. 

Kirchhoff’s laws, as restated by Maxwell, have a very simple formulation. 
Kirchhoff’s current law asserts that since charge cannot be created or destroyed, 
and since no charge can be stored at an ideal point, the algebraic sum of all the 
currents entering or leaving a junction of branches must be zero. Kirchhoff’s voltage 
law asserts that there exists a function, $\Phi$, called the electrostatic potential, 
such that the voltage across each branch is given by the difference of the values 
of $\Phi$ at the end points, i.e., the two junctions of the branch. Maxwell devised two 
methods of solving the resistive network problem which are known as Maxwell’s 
mesh-current and node-potential methods. We shall begin by working some 
examples to illustrate Maxwell’s methods, and, in the process, set up some of the 
language of algebraic topology. 


12.1. Linear resistive circuits 

Resistors connected by metal wires of negligible resistance, as shown in figure 12.5, 
are said to be connected in parallel. Suppose that a battery supplying a constant 
voltage, $V$, is connected across the group of resistors. The $i$th resistor has resistance 
$R_i$ and we set $G_i = R_i^{-1}$. ($G_i$ is called the conductance of the $i$th resistor.) We are 
interested in calculating the total current delivered by the battery and the current 
in each branch after the current has become steady. The connection between all 
the upper terminals of the resistors ensures that in the steady state all these 
terminals are at the same potential, and the same applies to the lower terminals. 
Hence the voltage across each resistor is the same, and is equal to the voltage, $V$, 
of the battery. From the point of view of circuit theory, we can imagine all of the 
upper wire shrunk to a point, and similarly all of the wire connecting the lower 
terminals. We may thus replace figure 12.5 by figure 12.6 in which there are two 
vertices $A$ and $B$, the top and the bottom, and $n + 1$ branches, of which one is the 
battery and the rest are linear passive resistors. Since the same voltage, $V$, is applied 
across all the resistors, it follows from the equation $V = RI$ or $I = GV$ that the 
currents through the resistors are $G_1V, G_2V, \ldots, G_nV$. By Kirchhoff’s current law, 
the current flowing through the battery branch must be equal and opposite to the 
sum of the currents flowing through the resistors, and is therefore 

$$-(G_1 + G_2 + \cdots + G_n)V.$$

Figure 12.6 

(In deriving this result, we are making the sign convention that all branches 
in figure 12.6 are given similar orientations, so that, for example, all currents flowing 

as negative.) We have completely solved this trivial network problem in that we 
now know the voltage across and the current through each branch. It follows 
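from the above result that the total current supplied by the battery is the same 
as would be supplied if the battery were connected across a single resistor of 
conductance 

$$G = G_1 + G_2 + \cdots + G_n,$$

i.e., of resistance $R$ where 

$$\frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2} + \cdots + \frac{1}{R_n}.$$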

This is, of course, the well-known result, taught in all elementary courses, 
which states that, if a number of resistors are connected in parallel, they are 
equivalent to a single resistor the reciprocal of whose resistance is equal to the 
sum of the reciprocals of the resistances of the individual resistors. 
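
The parallel rule can be checked numerically. The following is only a minimal sketch in Python; the function name `parallel_circuit` and the sample values (a 12 volt battery across resistors of 2, 3 and 6 ohms) are our own illustrative assumptions and do not come from the text. It uses the sign convention above, under which the battery current is the negative of the sum of the resistor currents.

```python
def parallel_circuit(voltage, resistances):
    """Solve the parallel network: every resistor sees the full battery voltage.

    Returns (branch_currents, battery_current, equivalent_resistance).
    """
    # Conductance of each resistor, G_i = 1 / R_i.
    conductances = [1.0 / r for r in resistances]
    # Current through each resistor, I_i = G_i * V.
    branch_currents = [g * voltage for g in conductances]
    # Kirchhoff's current law: the battery current is equal and opposite
    # to the sum of the resistor currents.
    battery_current = -sum(branch_currents)
    # Conductances add, so 1/R = 1/R_1 + ... + 1/R_n.
    equivalent_resistance = 1.0 / sum(conductances)
    return branch_currents, battery_current, equivalent_resistance


# Hypothetical sample values: V = 12, R = 2, 3, 6.
currents, i_battery, r_eq = parallel_circuit(12.0, [2.0, 3.0, 6.0])
print(currents, i_battery, r_eq)
```

With these sample values the resistor currents come out close to 6, 4 and 2 amperes, and the equivalent resistance close to 1 ohm, up to floating-point rounding.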

A group of resistors connected by wires of negligible resistance as shown in 
figure 12.7 is said to be connected in series. Suppose that a battery of voltage $V$ is 
connected as shown, and that, as a result, a steady current $I$ flows around the 
circuit. (By Kirchhoff’s current law we must have the same current, $I$, flowing 
through each branch, because the algebraic sum of the currents at each node must 
be zero.) Let the resistances of the various resistors be $R_1, R_2, \ldots, R_n$. Since the 
current $I$ flows through the $i$th resistor, it follows that the voltage across the $i$th 
resistor is $R_iI$. It follows from Kirchhoff’s voltage law that the voltage across the 
battery must equal the sum of the voltage differences across all the resistors. Thus 

$$V = (R_1 + R_2 + \cdots + R_n)I.$$


Since we know $V$, we can solve this equation for $I$ and thus obtain the currents and 
voltages through each branch. We have completely solved this network problem. 
Again we have merely reproduced the well-known elementary result which states 
that if a number of resistors are connected in series they are equivalent to a single 
resistor whose resistance is the sum of the resistances of the individual resistors. 
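
The series rule admits the same kind of numerical check. Again this is only a sketch; the name `series_circuit` and the sample values (a 12 volt battery and resistors of 1, 2 and 3 ohms) are assumptions made for illustration.

```python
def series_circuit(voltage, resistances):
    """Solve the series network: the same current flows through every branch.

    Returns (current, voltage_drops, total_resistance).
    """
    # Resistances add in series: R = R_1 + ... + R_n.
    total_resistance = sum(resistances)
    # Kirchhoff's voltage law gives V = (R_1 + ... + R_n) * I.
    current = voltage / total_resistance
    # Voltage across the i-th resistor is R_i * I.
    voltage_drops = [r * current for r in resistances]
    return current, voltage_drops, total_resistance


# Hypothetical sample values: V = 12, R = 1, 2, 3.
current, drops, r_total = series_circuit(12.0, [1.0, 2.0, 3.0])
print(current, drops, r_total)
```

The individual voltage drops should add up to the battery voltage, which is exactly the content of Kirchhoff’s voltage law for this circuit.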


Let us now consider a slightly more complicated circuit consisting of a battery of 
voltage $V$ connected to three resistors of resistances $R_1$, $R_2$ and $R_3$, as shown in 
figure 12.8. A point in the network from which two or more wires run to different 
elements is called a node or a vertex. The point $A$ is a node from which wires run to 
the battery and to one of the resistors. The point $B$ is a node from which wires run to 
all three resistors. The point $C$ is a node from which there run three wires, one to the 
battery and two to different resistors. We do not consider the lower right-hand 
corner as a separate node; since it is connected by a wire of zero resistance to the 
point $C$, it must be identified as being the same node as $C$. Thus the circuit has three 
nodes, $A$, $B$, and $C$, and four branches, the battery branch joining $A$ to $C$, the resistor 
joining $A$ to $B$, and the two resistors joining $B$ to $C$. The lengths and shapes of the 
wires making the interconnections are completely irrelevant. What matters are the 
branches and the nodes and how the branches and nodes are interconnected. Thus 
figure 12.9 describes exactly the same circuit as figure 12.8. A network as simple as 
that shown in figure 12.8 or 12.9 can be analyzed into parallel and series connections, 

Figure 12.8 









resistors $R_2$ and $R_3$ are in parallel, and hence are equivalent, as far as the rest of 
the circuit is concerned, to a single resistor whose resistance is the reciprocal of 
$R_2^{-1} + R_3^{-1}$, i.e., of resistance 

$$\frac{R_2R_3}{R_2 + R_3}.$$

This equivalent resistor is connected in series with the resistor $R_1$. Thus, as far as the 
battery is concerned, it is as if there were just a single resistor of resistance 

$$R = R_1 + \frac{R_2R_3}{R_2 + R_3}.$$

Thus the current drawn from the battery is 

$$I = V/R.$$

This must also be the current through $R_1$, so that $I_1 = V/R$ is the current through 
$R_1$. The voltage drop across $R_1$ is then $V_1 = I_1R_1$. The current $I$ divides between the 
parallel resistors $R_2$ and $R_3$ in proportion to the inverse of their resistances, as we 
have seen when we discussed the example of resistors in parallel. Thus the currents 
through $R_2$ and $R_3$ are 

$$I_2 = \frac{R_3I}{R_2 + R_3} \quad\text{and}\quad I_3 = \frac{R_2I}{R_2 + R_3}.$$

From this we see that the common voltage drop across $R_2$ and $R_3$ is 

$$\frac{R_2R_3I}{R_2 + R_3}.$$
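
The series-parallel reduction just carried out can be followed step by step in a few lines of Python. This is a sketch under stated assumptions: the function name `series_parallel`, the dictionary keys, and the sample values ($V = 10$, $R_1 = 2$, $R_2 = 6$, $R_3 = 3$) are ours, chosen only for illustration.

```python
def series_parallel(voltage, r1, r2, r3):
    """R1 in series with the parallel pair (R2, R3), driven by a battery."""
    # Equivalent resistance of the parallel pair: R2*R3 / (R2 + R3).
    r23 = r2 * r3 / (r2 + r3)
    # Total resistance seen by the battery: R = R1 + R2*R3 / (R2 + R3).
    r_total = r1 + r23
    # Current drawn from the battery, which is also the current through R1.
    i_total = voltage / r_total
    # The current divides in inverse proportion to the resistances.
    i2 = r3 * i_total / (r2 + r3)
    i3 = r2 * i_total / (r2 + r3)
    return {
        "R": r_total,
        "I": i_total,
        "V1": i_total * r1,     # voltage drop across R1
        "I2": i2,
        "I3": i3,
        "V23": r23 * i_total,   # common voltage drop across R2 and R3
    }


# Hypothetical sample values.
result = series_parallel(10.0, 2.0, 6.0, 3.0)
print(result)
```

Two consistency checks are worth making on any output: $I_2 + I_3$ should equal $I$, and the drop across $R_1$ plus the common drop across $R_2$ and $R_3$ should equal the battery voltage.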

In this case we have obtained all the relevant information about the network by 
considering it as a resistor $R_1$ in series with a pair of parallel resistors $R_2$ and $R_3$. 
This procedure is frequently the most convenient way of proceeding when the 

complicated networks. We will now illustrate the two methods of Maxwell for this 
same simple network. 

Before doing so, we shall draw the network once again, but this time just 
indicating the branches (with their orientations), the nodes, and how they are joined 
together; but not indicating the nature of the individual branches. There are three 
nodes, $A$, $B$ and $C$, and four branches, $\alpha$, $\beta$, $\gamma$, and $\delta$. The branch $\alpha$ goes from $A$ to $B$. 
We shall write this fact symbolically as 

$$\partial\alpha = B - A.$$

Figure 12.10 

The symbol $\partial$ is called the boundary operator, and the above equation is interpreted 

branch leaves $A$, we count $A$ with a minus sign, and since it goes toward $B$, we count 
$B$ with a plus sign. The remaining boundary relations are 

$$\partial\beta = C - B, \quad \partial\gamma = C - B, \quad\text{and}\quad \partial\delta = A - C.$$

By a path we shall mean a succession of branches, each traversed in its proper 
direction or backwards, so that the end point of one segment is the origin of the next 
in the succession. Thus the path $\alpha + \beta$ goes from $A$ to $B$ and then from $B$ to $C$. The 
boundary of this path is $\partial(\alpha + \beta) = B - A + C - B = C - A$. The path $\beta - \gamma$ goes 
from $B$ to $C$ along $\beta$ and then from $C$ to $B$ along $\gamma$. It has no boundary: 
$\partial(\beta - \gamma) = (C - B) - (C - B) = 0$. We can think of a path as giving a succession - node, branch, 
node, branch,..., branch, node - in which each branch is flanked by the two points 
of its boundary. The path is said to join the first and the last points in the succession. 
The path is called closed if the first and the last points coincide. In general, we do not 
suppose that all the branches in a path are distinct; we can pass by the same node or 
branch several times. If all the elements of the succession are distinct from one 
another (except for the first and last point if the path is closed), we say that the path 
is simple. A simple closed path is called a mesh. Thus $M_1 = \alpha + \beta + \delta$ is a mesh, as 
is $M_2 = \gamma - \beta$. In a mesh we do not care about the starting point; that is, we regard 
the mesh $\beta + \delta + \alpha$ as defining the same mesh, $M_1$, as $\alpha + \beta + \delta$. The direction is 
important: $-\gamma + \beta = -M_2$ is not the same as $M_2$. We could also consider a third 
mesh, $M_3 = \alpha + \gamma + \delta$. In some sense, however, this mesh is not independent of the 
other two: if we formally add $M_1$ and $M_2$, and allow ourselves the option of applying 
the commutative law of addition, we get 

$$M_1 + M_2 = \alpha + \beta + \delta + \gamma - \beta = \alpha + \gamma + \delta = M_3.$$

We shall spend some time in the next section setting up the mathematics to justify
these kinds of manipulations. In Maxwell’s mesh-current method we choose an
independent set of meshes (precise definition in the next section) such as M_1 and M_2.
We introduce unknown currents, J_1 and J_2, flowing around these meshes. Thus J_1
flows through R_1, R_3 and the battery, while J_2 flows downward through R_2 and
upward through R_3. If we knew the values of J_1 and J_2 we could immediately
compute the values of the currents through the various branches. Indeed, since R_1
contributes only to the mesh M_1, the current through R_1 is equal to J_1, and the same
goes for the battery. Since R_2 is part of the mesh M_2 only, all the current through it
comes from the mesh current J_2 and so the current through R_2 is J_2. The branch
containing the resistor R_3 contributes positively to the mesh M_1 and negatively to
the mesh M_2, and thus the current through R_3 is J_1 − J_2. By Kirchhoff’s voltage
law, the total change in voltage as we go around any mesh must be zero. Applied to
M_1, we get a drop of voltage R_1 J_1 across α, a drop of R_3(J_1 − J_2) across β and an
increase of V across the battery of δ. Thus

R_1 J_1 + R_3(J_1 − J_2) − V = 0.

Similarly, for M_2 we get

R_2 J_2 + R_3(J_2 − J_1) = 0.

We thus get two equations for the two unknowns, J_1 and J_2; it is easy to check that
these can be solved to yield the same results as before.
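
As a numerical check of the two mesh equations, the following minimal Python sketch solves them by Cramer's rule in exact arithmetic. The function names and the sample values R_1 = 1, R_2 = 2, R_3 = 2, V = 4 are illustrative assumptions and do not come from the text.

```python
from fractions import Fraction


def solve_mesh_currents(R1, R2, R3, V):
    """Solve the two mesh equations for (J1, J2) by Cramer's rule.

    R1*J1 + R3*(J1 - J2) - V = 0
    R2*J2 + R3*(J2 - J1)     = 0
    """
    R1, R2, R3, V = (Fraction(x) for x in (R1, R2, R3, V))
    # Rewritten as a 2x2 linear system:
    #   (R1 + R3)*J1 -        R3*J2 = V
    #        -R3*J1 + (R2 + R3)*J2 = 0
    det = (R1 + R3) * (R2 + R3) - R3 * R3
    J1 = V * (R2 + R3) / det
    J2 = V * R3 / det
    return J1, J2


def branch_currents(J1, J2):
    """Branch currents implied by the mesh currents (hypothetical helper)."""
    return {"R1": J1, "R2": J2, "R3": J1 - J2, "battery": J1}


# Illustrative values only (an assumption, not taken from the text).
J1, J2 = solve_mesh_currents(1, 2, 2, 4)
print(J1, J2, branch_currents(J1, J2))
```

Exact fractions are used so that the two mesh equations can be checked without rounding error.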

Thus the gist of the mesh-current method is to choose an independent set of
meshes, and assign unknown currents to them. This then determines the currents in
all the branches in terms of the mesh currents. We then apply Kirchhoff’s voltage law
to each mesh in the form which asserts that the total change in voltage around each
mesh must vanish. This gives one equation for each mesh, and hence as many
equations as there are unknowns. In some sense, the idea of the mesh-current
method is to introduce unknown currents in such a fashion that Kirchhoff’s current
law is automatically satisfied, and then to use the voltage law. We defer the precise
definition of ‘independent’ and the proof of the theorem which asserts that the
method works (i.e., that the equations have unique solutions for resistive networks)
to the next few sections.

We now explain the node-potential method. Since the electric potential function
is determined only up to an additive constant, we may fix the potential at one of
the nodes to be zero. (If one of the nodes is connected to ground, it is usually
convenient to select this node as having zero potential.) Using our same old network
as example, let us set the potential at C equal to zero. The potential at A must
then be V, the potential supplied by the battery. The only node whose potential
is unknown is B. Let us denote this unknown potential by x. We now apply
Kirchhoff’s current law to those nodes with unknown potentials. In our case there
is only one, node B. We can express the current flowing into node B through each
of the three branches connected to it in terms of the resistances and the differences
between node potentials. Thus, the current flowing into B through R_1 is (V − x)/R_1,
the current flowing into B through R_2 is (0 − x)/R_2, and the current flowing into
B through R_3 is (0 − x)/R_3. Kirchhoff’s current law states that the sum of these
three currents equals zero, so that

(V − x)/R_1 − x/R_2 − x/R_3 = 0.

We can solve this equation for x, then determine the voltages across and the currents
through all branches. The proof of why this method works in general will be deferred
to the next section.
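
The single node equation can be solved in closed form. The sketch below does this in Python with exact fractions; the function names and the sample values R_1 = 1, R_2 = 2, R_3 = 2, V = 4 are assumptions chosen only for illustration.

```python
from fractions import Fraction


def node_potential_B(R1, R2, R3, V):
    """Potential x at node B, from (V - x)/R1 - x/R2 - x/R3 = 0."""
    R1, R2, R3, V = (Fraction(x) for x in (R1, R2, R3, V))
    return (V / R1) / (1 / R1 + 1 / R2 + 1 / R3)


def currents_into_B(R1, R2, R3, V):
    """Currents flowing into B through R1, R2 and R3, in that order."""
    x = node_potential_B(R1, R2, R3, V)
    return ((V - x) / Fraction(R1),
            (0 - x) / Fraction(R2),
            (0 - x) / Fraction(R3))


# Illustrative values only (an assumption, not taken from the text).
print(node_potential_B(1, 2, 2, 4))
print(currents_into_B(1, 2, 2, 4))
```

The three currents returned by `currents_into_B` sum to zero, which is exactly Kirchhoff's current law at node B.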

Notice that in our example the node-potential method is superior to the mesh
method since it involves solving for one unknown rather than two. It is not clear that
the node-potential method is superior to our original analysis into parallel and
series connections.

In dealing with a general network, it is advisable first to check by inspection 
whether it can easily be decomposed into parallel and series connections. If not, it is 
probably not worthwhile to try to figure out such a decomposition. Then check 
how many independent meshes there are and determine how many unknown mesh 
currents. Similarly determine how many unknown node potentials there are. 
(Frequently, symmetry considerations can cut down on the number of unknowns.) 
Choose the method with the fewer unknowns and use it to solve the network. 
Although it is not advisable to choose between the latter two methods by casual
inspection, a general rule of thumb is that a network with few meshes and many
nodes will yield more readily to the mesh-current method, and a network with few
nodes and many meshes will yield more readily to the node-potential method.
Another relevant factor is how the sources are specified. If the network is energized
by sources having specified voltages, this tends to reduce the number of unknown
node potentials, and hence favor the node-potential method. If currents are
specified, this tends to favor the mesh-current method. To allow for all these
considerations, it is usually best to draw two diagrams of the network and mark the
number of unknowns of the mesh-current method on one and the number of
unknowns of the node-potential method on the other in order to make an intelligent
choice between the two methods.

So far we have been considering purely resistive networks. We can also apply
these methods to determine the steady-state (oscillating) behavior of linear circuits
with inductors and capacitors. (Linear here means that all inductances and
capacitances are constant.) If all the generators are sinusoidal with the same
frequency, ω/2π, i.e., all voltage sources are of the form Ve^{iωt}, and all current sources
of the form Ie^{iωt}, then, as is well known (and follows directly from the definitions),
an inductor with inductance L acts by the law V = iωLI and a capacitor with
capacitance C acts by the law V = (1/iωC)I. We can now apply the same rules as
before with these complex resistances or impedances. In this situation, however,
solutions need not always exist, due to the phenomenon of resonance. Thus, for
example, suppose that an inductor with L = 1 is in series with a capacitor with C = 1.
If we put this series together with a generator with ω = 1, the rule for adding
resistances in series gives R = i + 1/i = 0, i.e., a short circuit drawing an infinite
current. In the next section we shall discuss some conditions which avoid this
unrealistic situation.
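
The resonance example can be reproduced with Python's built-in complex numbers. This is only a sketch: the function names and the tolerance `tol` are assumptions introduced here for illustration.

```python
def series_lc_impedance(L, C, omega):
    """Impedance i*omega*L + 1/(i*omega*C) of an inductor and capacitor in series."""
    return 1j * omega * L + 1 / (1j * omega * C)


def is_resonant(L, C, omega, tol=1e-12):
    """True when the series impedance vanishes to within tol (hypothetical helper)."""
    return abs(series_lc_impedance(L, C, omega)) < tol


# L = C = 1 driven at omega = 1 is the case discussed in the text.
print(series_lc_impedance(1, 1, 1))
print(is_resonant(1, 1, 1), is_resonant(1, 1, 2))
```

At any other frequency the impedance is a nonzero imaginary number, so a finite current flows.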


12.2. The topology of one-dimensional complexes

The terms oriented graph and one-dimensional complex are synonymous. They both 
refer to a mathematical structure that will represent for us the structure consisting of 




the nature of the various branches; this structure will allow us to study the nature of 




A one-dimensional complex is a collection consisting of two sets: a set of
zero-dimensional objects or nodes {A, B, ...} = S_0 and a set of one-dimensional objects
or branches {α, β, ...} = S_1, together with a rule which assigns to each branch two
distinct nodes, the initial point and the final point of the branch. Thus we are given a
map from S_1 to S_0 × S_0.* In what follows we shall assume that the sets S_0 and S_1 are
finite.




an integer for each branch, namely the number of times the branch is traversed in the
path, with orientation taken into account, so that when the branch is traversed from
its initial point to its final point the contribution is +1, and when it is
traversed in the opposite direction the contribution is −1. Thus each path determines
a vector p = (p_α, p_β, ...)^T with integer coefficients, where the coordinates of
the vector p are labeled by the branches of the graph; with p_α, for example, giving the
total number of times the branch α is crossed in the positive direction minus the total
number of times it is crossed in the negative direction. We can also think of a current
distribution of the network as giving a vector I = (I_α, I_β, ...)^T whose coordinates are
labeled by the branches, where now I_α, for example, is the real number giving the
current, in amperes, through the branch α (if our one-dimensional complex were the




introduce the vector space consisting of all vectors K = (K_α, K_β, ...)^T whose
components are indexed by the branches. (Unless otherwise specified, we shall
We shall denote this vector space by C_1 and call it the space of one-chains. We shall
identify each branch, κ, with the vector that has 1 in the κ position and zeros
elsewhere. Thus α = (1,0,0,...)^T, β = (0,1,0,...)^T, etc.

Similarly, we construct the real vector space, C_0, consisting of vectors whose
components are indexed by the nodes and call this the space of zero-chains. Again,
we will identify a node A with the vector which is 1 in the Ath position and zero
elsewhere, so that an expression such as A − B makes sense as an element of the
vector space C_0. Notice that dim C_1 is the number of branches and dim C_0 is the
number of nodes.

* S_0 × S_0 denotes the Cartesian product of the set S_0 with itself. So S_0 × S_0 is just the collection
of all ordered pairs of nodes.

We now define a linear map called the boundary map, ∂, from C_1 to C_0. To define
the map ∂ it is sufficient to prescribe its values on each of the branches, since the
branches form a basis for C_1. Each branch has an initial point and a terminal point
and we define ∂(branch) = (terminal node) − (initial node). Thus, for example, if α is
a branch going from A to B, then ∂α = B − A.
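
The definition of the boundary map translates directly into code. The sketch below uses the three-node, four-branch network of section 12.1 (∂α = B − A, ∂β = C − B, ∂γ = C − B, ∂δ = A − C); the dictionary representation of chains and the names `NODES`, `BRANCHES` and `boundary` are assumptions made for illustration.

```python
NODES = ["A", "B", "C"]

# Each branch is recorded as (initial node, terminal node).
BRANCHES = {
    "alpha": ("A", "B"),
    "beta": ("B", "C"),
    "gamma": ("B", "C"),
    "delta": ("C", "A"),
}


def boundary(chain):
    """Boundary of a one-chain given as {branch: coefficient}.

    Returns a zero-chain as {node: coefficient}, extending
    boundary(branch) = terminal node - initial node by linearity.
    """
    result = {node: 0 for node in NODES}
    for branch, coeff in chain.items():
        initial, terminal = BRANCHES[branch]
        result[terminal] += coeff
        result[initial] -= coeff
    return result


print(boundary({"alpha": 1}))                          # boundary of alpha
print(boundary({"alpha": 1, "beta": 1}))               # path alpha + beta
print(boundary({"alpha": 1, "beta": 1, "delta": 1}))   # mesh M_1
```

A chain with zero boundary at every node is a closed path or, more generally, a cycle.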

Let us examine the meaning of the operator ∂. Suppose that* K =
(K_α, K_β, K_γ, ...)^T and ∂K = L, where L = (L_A, L_B, ...)^T. In computing a term such as
L_A, we see that we get a sum of certain of the coefficients of K: in fact

L_A = (K_{δ_1} + ··· + K_{δ_l}) − (K_{ε_1} + ··· + K_{ε_k})

where δ_1, ..., δ_l are all the branches which go to A and ε_1, ..., ε_k are all the branches
which leave A. Thus, from the example in figure 12.14,

L_A = K_β − K_η − K_α.

From this we see that Kirchhoff’s current law has a very simple formulation:

Figure 12.14

Kirchhoff’s current law: If I is the one-chain giving the current distribution
of an electric circuit, then

∂I = 0.



Recall that a simple closed path is called a mesh. In figure 12.15, for example, the
path α + β + δ, represented by the vector p_1 = (1,1,0,1)^T, is a mesh, as is the path
−β + γ, represented by the vector p_2 = (0,−1,1,0)^T. The sum of these two vectors,
p_3 = p_1 + p_2 = (1,0,1,1)^T, is another mesh, α + γ + δ. Clearly, each of these meshes
has no boundary: ∂p_1 = ∂p_2 = ∂p_3 = 0.

For any one-dimensional complex we shall denote the subspace of C_1 consisting
of those one-chains satisfying ∂K = 0 by Z_1. We express this relationship symbolically
as Z_1 = ker ∂ ⊂ C_1. We call the elements of Z_1 cycles. Every mesh is a cycle, but
not every cycle is a mesh. For example, the vector

\[ I_1 = \begin{pmatrix} 3/2 \\ 1 \\ 1/2 \\ 3/2 \end{pmatrix} \]

satisfies ∂I_1 = 0 but does not describe a mesh since it has entries other than 1, −1,
or 0. It does represent a set of currents satisfying Kirchhoff’s current law. In fact,
one of the easiest ways to verify that I_1 is a cycle is to write in the appropriate
current next to each branch and to check that the algebraic sum of the currents
at each node equals zero. See figure 12.16.

Alternatively we could notice that I_1 = (3/2)p_1 + (1/2)p_2; since p_1 and p_2 are elements
of Z_1, so is I_1. In fact, for this simple network, the two meshes p_1 and p_2 form a basis
for Z_1.

* On the preceding page and in much of what follows we write vectors in ℝ^n as (a, b, c, ...)^T
instead of as column vectors, in order to save typesetting space.



Having defined the subspace Z_1 ⊂ C_1 as the kernel of the boundary map
∂, we now turn our attention to the image of ∂. We denote by B_0 the subspace of C_0
which is the image of ∂. Symbolically, we may write

B_0 = ∂C_1 ⊂ C_0.

We call B_0 the space of boundaries.

Figure 12.17


The significance of B_0 may be made clearer by reference to figure 12.17. The
element of C_0, A − B = (1,−1,0)^T, which is the boundary of the branch α, lies in the
subspace B_0; so does B − C = (0,1,−1)^T, which is the boundary of β. The sum of
these two vectors, A − C = (1,0,−1)^T, is again an element of B_0, and it is the
boundary of the path α + β.

Not every element of B_0 can be interpreted as the boundary of a path, however.
For example, (2,−1,−1)^T is an element of B_0 that corresponds to no single path.

If we consider those elements of C_0 that do not lie in the subspace B_0, we find that
they do not form a subspace. For the network of figure 12.17, to take a simple
example, the vectors A = (1,0,0)^T and B = (0,1,0)^T are not elements of B_0, but their
difference A − B = (1,−1,0)^T is an element of B_0. We can, however, form the
quotient space H_0 = C_0/B_0, whose elements are equivalence classes of elements of
C_0 whose difference lies in B_0. The vectors A = (1,0,0)^T and B = (0,1,0)^T in C_0
correspond to the same vector in H_0 because their difference is the vector (1,−1,0)^T,
which lies in B_0. We denote this equivalence class by Ā or by $\overline{(1,0,0)^T}$. Similarly, the
vectors (2,0,0)^T and (0,1,1)^T belong to the same equivalence class, $\overline{(2,0,0)^T}$, because
their difference (2,−1,−1)^T lies in B_0. For the simple network of figure 12.17, in
fact, every vector in C_0 lies in the same equivalence class as a vector of the form
(a,0,0)^T, where a is a real number. The quotient space H_0 = C_0/B_0 is therefore
a one-dimensional vector space, isomorphic with the real numbers.

For an example of a one-complex in which H_0 is two-dimensional, refer to
figure 12.18. The vectors (1,0,0,0)^T and (0,0,1,0)^T do not belong to the same equivalence
class because their difference (1,0,−1,0)^T is not the boundary of any element of C_1.
adjoin all new branches emanating from these nodes and then all new points at the
end of these branches. We continue this way until we have no new branches or nodes
to adjoin. We thus obtain a collection S_0(A) of nodes and a collection S_1(A) of branches.
If this is not the whole
complex, we pick some node B which does not lie in S_0(A) and repeat the process.
Proceeding in this way we get a collection of disjoint sets S_0(A), S_0(B), etc., and
S_1(A), S_1(B), etc., with each of the subcomplexes (S_0(A), S_1(A)), (S_0(B), S_1(B)), etc.,
connected and none of them connected to any other. We get corresponding vector
spaces C_0(A), etc., and direct sum decompositions* C_0 = C_0(A) ⊕ C_0(B) ⊕ ··· and C_1 =
C_1(A) ⊕ C_1(B) ⊕ ··· with ∂C_1(A) ⊂ C_0(A), ∂C_1(B) ⊂ C_0(B), etc. From this we see that
H_0 = H_0(A) ⊕ H_0(B) ⊕ ··· and so dim H_0 = the number of summands = the number
of connected components.
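
The procedure of repeatedly adjoining branches and nodes amounts to computing connected components. A minimal Python sketch using a union-find structure is given below; the function name and the four-node example are assumptions chosen for illustration and are not the complex of figure 12.18.

```python
def count_components(nodes, branches):
    """Number of connected components, which equals dim H_0.

    nodes:    iterable of node labels
    branches: iterable of (initial node, terminal node) pairs
    """
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the representative, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for initial, terminal in branches:
        parent[find(initial)] = find(terminal)

    return len({find(node) for node in parent})


# Hypothetical complex with two separate pieces: A-B and C-D.
print(count_components("ABCD", [("A", "B"), ("C", "D")]))
```

Orientation of the branches plays no role here, since connectedness ignores direction.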
∂M = 0. We say that a set of meshes is independent if the corresponding elements of
finding such a family of meshes. Notice that in view of our discussion of (i) it is
sufficient to work with a connected complex; if the complex is not connected we 
simply apply our procedure to each connected component separately. Since the 
components are completely independent of one another, this will give the result for 
the full complex. 

A connected complex containing no meshes is called a tree. In a tree there exists at 
least one node which is a boundary point of only one branch. (We are uninterested in 
the trivial case of a complex with one node and no branches.) To prove this fact, 
simply start at any node. If more than one branch impinges on this node choose one 




Figure 12.19 
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still another node. Continue this procedure. Since there are no meshes, we can never 
come back to an earlier node. Since there are only a finite number of nodes, 
eventually we must reach a node which is the boundary of only one branch. 

Suppose we have a tree and we start from a node which is at the end of only one
branch. From this node we can build up the whole tree by adding one branch at a
time. Since there are no meshes, each time we add a new branch we add a new node.
Thus, in a tree, the number of nodes is exactly one more than the number of
branches: if b denotes the number of branches and n denotes the number of nodes, then

n = b + 1.

We have proved that, for any connected complex, dim H_0 = 1. Since H_0 =
C_0/B_0, dim H_0 = dim C_0 − dim B_0, and we may conclude that dim B_0 = n − 1 = b.
Recall that, for a linear map T defined on a finite-dimensional vector space V, we have
dim(im T) + dim(ker T) = dim V. Applying this to the map ∂, we conclude
that dim(ker ∂) = dim C_1 − dim B_0 = b − b = 0, so that ker ∂ = {0}; i.e. Z_1 = {0}.
Thus a tree has no non-zero cycles.

Suppose we had any connected complex and built it up as before starting from
some arbitrary node. Each time we add a branch we may or may not add a new node.
Eventually, when we attach all the branches, we will have added all the nodes since
the complex is connected. Thus for a general connected complex we have

n ≤ b + 1.

For a connected complex we have already proved that dim H_0 = 1, and we may
conclude that dim B_0 = dim C_0 − dim H_0 = n − 1.
Since B_0 is the image of ∂ and Z_1 is the kernel of ∂, we know that dim B_0 +
dim Z_1 = dim C_1 = b or (n − 1) + dim Z_1 = b. Therefore

dim Z_1 = b + 1 − n.
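
The dimension count can be checked numerically by computing the rank of the matrix of ∂. The sketch below does this by Gaussian elimination over exact fractions for the connected network of section 12.1 (n = 3 nodes, b = 4 branches); the function name `rank` and the matrix layout (rows indexed by nodes A, B, C, columns by branches α, β, γ, δ) are assumptions made for illustration.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


# Matrix of the boundary map: each column is terminal node minus initial node.
BOUNDARY = [
    [-1, 0, 0, 1],    # node A
    [1, -1, -1, 0],   # node B
    [0, 1, 1, -1],    # node C
]

n, b = len(BOUNDARY), len(BOUNDARY[0])
dim_B0 = rank(BOUNDARY)   # dimension of the image of the boundary map
dim_Z1 = b - dim_B0       # dimension of the kernel
print(dim_B0, dim_Z1, b + 1 - n)
```

For this network the rank is n − 1 and the kernel has dimension b + 1 − n, in agreement with the formula just derived.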


To prove (ii) we must exhibit a family of b + 1 − n independent meshes which form
a basis for Z_1. We do this by first dividing the set of branches S_1 into a maximal tree T
and a complementary subset T̄ by the following procedure: find a branch which is part of
a mesh, and put it aside as a member of the subset T̄. The remaining branches still
connect all nodes. Repeat this procedure until no meshes remain. At this point the
branches which have not been placed in the subset T̄ constitute the maximal tree T.
This procedure does not determine a unique maximal tree; figure 12.20 shows two
different ways in which a given complex can be reduced to a maximal tree (solid lines)
by removal of branches (dotted lines) which form part of meshes.

Since T forms a tree which connects all nodes, it contains n − 1 branches. There
are b − (n − 1) = b + 1 − n branches in T̄. Each of these branches in T̄ has its ends
connected by a set of branches in T, since T connects all nodes; and, furthermore,
since T contains no meshes, this connecting path is unique. Each branch in T̄,
combined with this unique path in T joining its ends, forms a mesh. Since the
number of branches in T̄, b + 1 − n, equals the dimension of Z_1, we now have only
to prove that these meshes are linearly independent.

Figure 12.20. Two different maximal trees for the same complex

To prove linear independence, let α_i denote the ith branch of T̄ and let M_i
denote the mesh that includes branch α_i. Consider a linear combination of meshes,
Σ c_i M_i. Since α_i occurs only in mesh M_i, with a coefficient of +1, the coefficient
of α_i in the sum must be c_i. Therefore we cannot have Σ c_i M_i = 0 unless all the
c_i = 0, and we conclude that the meshes are linearly independent.
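
The construction of a mesh basis from a maximal tree can be sketched in Python as follows. This version grows the tree outward from one node instead of deleting mesh branches, which is a different procedure from the one in the text but also yields a maximal tree for a connected complex. The function names and the dictionary encoding of branches are assumptions; the example network is the one of section 12.1.

```python
from collections import deque


def tree_path(start, goal, tree, branches):
    """Signed tree branches on the unique tree path from start to goal."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for name in tree:
            u, v = branches[name]
            if u == node and v not in previous:
                previous[v] = (node, name, +1)   # traversed forwards
                queue.append(v)
            elif v == node and u not in previous:
                previous[u] = (node, name, -1)   # traversed backwards
                queue.append(u)
    path, node = {}, goal
    while previous[node] is not None:
        node, name, sign = previous[node]
        path[name] = sign
    return path


def fundamental_meshes(nodes, branches):
    """Return (tree, meshes) for a connected complex.

    branches: {name: (initial node, terminal node)}
    meshes:   {non-tree branch: {branch: +1 or -1}}
    """
    tree, reached, grew = [], {nodes[0]}, True
    while grew:
        grew = False
        for name, (u, v) in branches.items():
            # Adjoin a branch only if it reaches exactly one new node.
            if name not in tree and (u in reached) != (v in reached):
                tree.append(name)
                reached |= {u, v}
                grew = True
    meshes = {}
    for name, (u, v) in branches.items():
        if name not in tree:
            mesh = {name: 1}                            # coefficient +1
            mesh.update(tree_path(v, u, tree, branches))  # return through the tree
            meshes[name] = mesh
    return tree, meshes


NETWORK = {"alpha": ("A", "B"), "beta": ("B", "C"),
           "gamma": ("B", "C"), "delta": ("C", "A")}
tree, meshes = fundamental_meshes(["A", "B", "C"], NETWORK)
print(tree)
print(meshes)
```

Each branch outside the tree appears in exactly one mesh, with coefficient +1, which is the property used in the independence argument.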


Trees and projections 

Notice that a choice of maximal tree T determines a projection p_T of C_1 onto the
subspace Z_1, as follows:

If α_i ∈ T, then p_T(α_i) = 0.

If α_i ∈ T̄, then p_T(α_i) = M_i.

For example, if the maximal tree consisting of α and β is chosen in the network of
figure 12.4 the projection operator p_T is p_T(α) = 0, p_T(β) = 0, p_T(γ) = α + β + γ,
p_T(δ) = −α + δ. (Notice that each mesh M_i includes the branch α_i ∈ T̄ with a
coefficient of +1, not −1.) Relative to the basis α, β, γ, δ, the projection p_T may be
represented by the matrix

\[ \begin{pmatrix} 0 & 0 & 1 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

If we are dealing with a complex which is not connected we can complete the proof
of (ii) and construct the projection p_T by simply choosing a maximal tree in each
connected component.


The mesh current method 

We now understand the significance of Maxwell’s mesh-current method: by
choosing an independent family of meshes and assigning mesh currents J_i to each
mesh, we form an element J_1 M_1 + ··· + J_m M_m of Z_1, and the most general
assignment of currents consistent with Kirchhoff’s current law, i.e., the most
general element of Z_1, can be obtained in this way.

In the preceding discussion we have made use of the equation

1 − dim Z_1 = n − b        (12.1)

for connected complexes. Applying this to each component of a general complex and
adding, we get

dim H_0 − dim Z_1 = dim C_0 − dim C_1.        (12.2)

Since dim H_0 = dim C_0 − dim B_0 and dim B_0 = dim C_1 − dim Z_1, (12.2) is an
immediate consequence of the definitions of the various spaces associated with ∂.

In Maxwell’s mesh-current method, we think of the ‘mesh currents as determining
the currents in each branch’. We can give a mathematical formulation to this way of
thinking as follows: Consider the space of mesh currents just as another copy of Z_1,
call it H_1. That is, H_1, the space of mesh currents, is just a copy of Z_1, but thought of
as an abstract vector space, not as a subspace of C_1. We shall then consider the
operation of ‘finding the branch currents determined by the mesh currents’ as a
linear map, σ, where σ: H_1 → C_1 is the (identity) map which identifies H_1 with the
subspace Z_1 of C_1. For example, in the circuit of figure 12.21, the space H_1 is
two-dimensional. The map σ assigns to the mesh M_1 the branch currents α + β + γ, so we
write

σ(M_1) = α + β + γ

and similarly 

a (M 2 ) = y — p- 

Thus, if we use M_1 and M_2 as a basis in our abstract space, H_1, of mesh currents, and
use α, β, γ, δ as a basis of C_1, then the matrix of σ relative to these bases is the 4 × 2
matrix whose first column, (1, 1, 1, 0)^T, lists the coefficients of σ(M_1) and whose second
column lists the coefficients of σ(M_2).
We shall see that Maxwell’s mesh-current method reduces to the problem of
inverting a linear map from H_1 to its dual space, which we shall denote by H^1.
Returning to equation (12.2), we can now rewrite it as

dim H_0 − dim H_1 = dim C_0 − dim C_1.        (12.3)



Kirchhoff’s voltage law states that all branch voltages can be obtained as
potential differences from a potential function defined on the nodes. Since only 
differences of potential are significant, we may arbitrarily assign the potential at one 
node of each connected component to be zero. The potentials at the remaining nodes 
then determine all the branch voltages. For example, in figure 12.22, if we assign 
(jy 4 = = 0, then the branch voltages are V a = ® c , V p = <I> B , V s = <D C — <E> B , V 7 = <& E . 

Since the number of nodes is dim C_0, the number of connected components dim H_0,
we see that there are only dim C_0 − dim H_0 independent assignments of branch
voltages compatible with Kirchhoff’s voltage law. Thus the number of conditions imposed
by Kirchhoff’s voltage law is

dim C_1 − (dim C_0 − dim H_0) = dim H_1.

Figure 12.22

Since the number of independent assignments of branch currents compatible with
Kirchhoff’s current law equals dim H_1, the number of linearly independent meshes,
we see that Kirchhoff’s current law imposes dim C_1 − dim H_1 conditions. Thus the
two laws together impose dim C_1 = b conditions. (The laws are independent of each
other since one refers to currents and the other to voltages.) The equations relating
current to voltage in each branch give b equations. Together we get 2b equations
for the 2b unknowns consisting of the currents and voltages. Kirchhoff’s theorem
about linear resistive circuits asserts that we can indeed solve these equations. To
prove Kirchhoff’s theorem we must now turn our attention to Kirchhoff’s voltage
law.




12.3. Cochains and the d operator 

We first observe that voltage is, in a sense, dual to current; the product of the voltage,
V_γ, across a branch, γ, with the current I_γ through γ gives the power dissipated by γ.
Thus, if we want to introduce a voltage vector, it
should lie in the dual space of the space of one-chains. We denote this dual space by
C^1 and call it the space of one-cochains. Similarly, we denote the dual space of C_0 by C^0
and call it the space of zero-cochains. We now introduce two bits of notation that will
appear somewhat strange at first, but will prove suggestive of some far-reaching
generalizations later on. Given a K = (K_α, K_β, ...)^T ∈ C_1 and a W = (W_α, W_β, ...) ∈ C^1
we shall denote the value of the linear function, W, on the vector, K, by ∫_K W; thus

\[ \int_K W = W_\alpha K_\alpha + W_\beta K_\beta + \cdots. \]

Similarly, we denote the value of a zero-cochain f = (f_A, f_B, ...) on the
zero-chain c = (c_A, c_B, ...)^T by ∫_c f; thus

\[ \int_c f = f_A c_A + f_B c_B + \cdots. \]

Our second bit of notation has to do with the map ∂. The boundary map, ∂, is a linear
map from C_1 to C_0. Its adjoint will be a linear map from C^0 to C^1. We shall denote
this adjoint map by d and call it the coboundary operator. Thus, if f is a zero-cochain
and K is a one-chain, the value of df on K is equal to f evaluated on ∂K. In
terms of the notation we have introduced, we can write this as

\[ \int_K df = \int_{\partial K} f. \qquad (12.4) \]


(Our notation has been chosen so as to make the preceding equation look like the
fundamental theorem of the calculus. Of course, in the above equation there are no
integrals, just the evaluation of linear functions on vectors in a vector space, i.e.,
finite sums. However, if we think of f as a differentiable function on the line and df as
section 15.2.)

If α is a branch and ∂α = B − A, then the above equation becomes ∫_α df =
f(B) − f(A). Kirchhoff’s voltage law says that there exists a function, Φ, on the nodes,
i.e., a zero-cochain, such that V_α = Φ(A) − Φ(B) for any branch α with ∂α = B − A.
Thus Kirchhoff’s voltage law can be formulated as

V = −dΦ.

(Indeed, the preceding equation says that ∫_α V = −∫_α dΦ for every branch α; hence
it holds for all one-chains, since the branches form a basis of the space of one-chains.
Thus V and −dΦ take on the same value for all one-chains, i.e., V = −dΦ.)
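
The adjoint relation (12.4) between d and ∂ can be checked on a small example. The Python sketch below uses the network of section 12.1; the helper names `boundary`, `coboundary` and `pair`, and the particular numbers chosen for f and K, are assumptions made for illustration.

```python
NODES = ["A", "B", "C"]
BRANCHES = {"alpha": ("A", "B"), "beta": ("B", "C"),
            "gamma": ("B", "C"), "delta": ("C", "A")}


def boundary(chain):
    """Boundary of a one-chain {branch: coefficient} as {node: coefficient}."""
    result = {node: 0 for node in NODES}
    for branch, coeff in chain.items():
        initial, terminal = BRANCHES[branch]
        result[terminal] += coeff
        result[initial] -= coeff
    return result


def coboundary(f):
    """df as a one-cochain: its value on a branch is f(terminal) - f(initial)."""
    return {name: f[t] - f[i] for name, (i, t) in BRANCHES.items()}


def pair(cochain, chain):
    """Value of a cochain on a chain: the finite sum written as an 'integral'."""
    return sum(value * chain.get(key, 0) for key, value in cochain.items())


f = {"A": 5, "B": 3, "C": 0}                               # arbitrary zero-cochain
K = {"alpha": 2, "beta": -1, "gamma": 4, "delta": 7}       # arbitrary one-chain

print(pair(coboundary(f), K), pair(f, boundary(K)))
```

Both pairings give the same number for any choice of f and K, because `coboundary` is defined precisely as the transpose of `boundary`.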

An immediate consequence of Kirchhoff’s current and voltage laws is the result
known as Tellegen’s theorem. Suppose that, for a given network, I ∈ C_1 is a
distribution of branch currents satisfying Kirchhoff’s current law, ∂I = 0. Suppose
also that, for the same network, there is a distribution of voltages, V, which satisfies
the voltage law V = −dΦ. Then the total power dissipated in all branches is

\[ P = \sum_\alpha V_\alpha I_\alpha = \int_I V = -\int_I d\Phi. \]

By the very definition of the map d,

\[ P = -\int_I d\Phi = -\int_{\partial I} \Phi = 0, \]

since ∂I = 0. This result shows that energy is conserved in electrical networks:
batteries and generators supply energy at the same rate that it is dissipated in
resistors. It is characteristic of the power of our notation that no assumptions had
to be made about the nature of the branch elements.
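
Tellegen's theorem can be illustrated numerically. In the Python sketch below, the currents are built from mesh currents on the meshes M_1 = α + β + δ and M_2 = γ − β of section 12.1, so that they satisfy the current law, and the voltages are derived from a node potential. The specific potentials and mesh currents are assumptions chosen only for illustration.

```python
BRANCHES = {"alpha": ("A", "B"), "beta": ("B", "C"),
            "gamma": ("B", "C"), "delta": ("C", "A")}


def voltages_from_potential(phi):
    """V = -d(phi): on a branch from A to B, V = phi(A) - phi(B)."""
    return {name: phi[i] - phi[t] for name, (i, t) in BRANCHES.items()}


def currents_from_mesh_currents(J1, J2):
    """I = J1*M_1 + J2*M_2 with M_1 = alpha + beta + delta, M_2 = gamma - beta."""
    return {"alpha": J1, "beta": J1 - J2, "gamma": J2, "delta": J1}


def total_power(V, I):
    """Sum over branches of V_branch * I_branch."""
    return sum(V[name] * I[name] for name in BRANCHES)


# Illustrative potentials and mesh currents (assumptions, not from the text).
V = voltages_from_potential({"A": 4, "B": 2, "C": 0})
I = currents_from_mesh_currents(2, 1)
print(V, I, total_power(V, I))
```

The total vanishes for any potential and any pair of mesh currents, whether or not the voltages and currents are related by a resistance law.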


In considering currents, we found it useful to introduce two subspaces related
to the kernel and image of the boundary map ∂: the subspace of cycles Z_1 =
ker ∂ ⊂ C_1 and the subspace of boundaries B_0 = im ∂ ⊂ C_0. Let us now carry out
a similar program with the coboundary operator d.

Let Z^0 ⊂ C^0 denote the kernel of the operator d. Thus Z^0 is the subspace of
potential functions on the nodes with the property that all branch voltages are 
zero. For a connected complex, all branch voltages will be zero if and only if the 
potential is the same at every node. More generally, Z° is the subspace of potentials 
which are constant on each connected component. Adding an element of Z° to 
an element of C° has no physical significance; it amounts simply to changing the 
arbitrary reference level with respect to which any potential is defined. 

Suppose that Φ is an element of Z^0, so that dΦ = 0. Let ∂K denote an arbitrary
element of the space of boundaries B_0. Then

\[ \int_{\partial K} \Phi = \int_K d\Phi = 0. \]

In other words, any element of Z^0, acting on any element of B_0, gives zero.
Conversely, suppose that ∫_{∂K} Φ = 0 for all K. Then ∫_K dΦ = 0 for all K. So dΦ = 0, the
zero function on C_1, and Φ ∈ Z^0. For this reason Z^0 is said to be the annihilator space
of B_0.

Recall the quotient space H_0 = C_0/B_0, whose elements are equivalence
classes of elements of C_0 whose difference lies in B_0. Let c_1 and c_2 be elements of
C_0 whose difference c_1 − c_2 lies in B_0. Then, for any Φ ∈ Z^0,

\[ \int_{c_1} \Phi - \int_{c_2} \Phi = \int_{c_1 - c_2} \Phi = 0, \]

so that ∫_{c_1} Φ = ∫_{c_2} Φ if c_1 ≡ c_2 (mod B_0). Thus an element of Z^0 has the same value
when evaluated on any member of an equivalence class, so that Z^0 may be regarded
as the space of linear functions on H_0.

This relationship of Z° to C 0 , B 0 , and H 0 is an example of a general principle 
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both the annihilator space of the image of the transformation. It is also the dual 
space of the quotient by the image subspace. Here we have the transformation d. 



which are constant on each connected component, the elements of P^0 are
equivalence classes of functions which differ only by a constant on each connected

component. Thus if we modify a vector in C° by adding on a physically insignificant 




still corresponds to the same element of P°. It is P° which corresponds to the 
space of potentials, when we allow for the arbitrary additive constant in each 





12.4. Bases and dual bases

To choose a basis for P°, it is convenient to select one node in each connected 
component as ground, then choose a total of dim C° — dim Z° basis vectors for 
which the potential is unity at one node which is not a ground node, zero at all 
other nodes. For the network of figure 12.23, for instance, we may choose A and 



adjoint of the map [d]: P^0 → C^1 is the restricted boundary operator [∂]: C_1 → B_0.

Figure 12.23

the matrix representing [∂] is just the transpose of the matrix representing [d].
Relative to this basis, a vector in B_0 is represented by simply deleting from its
expression as a vector in C_0 the components corresponding to ground nodes. For
example, in figure 12.23, with ground nodes A and D, we have

∂α = B − A


so that in vector notation 



W \ 0 / 

and 


( -1 0 1 0\ 

i i n o i 


_A_ 
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A -t ‘t A 



i 0 1—1 0 

l 0 0 0 11 

\ 0Q0 -1/ 

but 




We now denote by $B^1$ the subspace of $C^1$ that is the image of $d$. Physically, $B^1$
is the space of branch voltage vectors that obey Kirchhoff’s voltage law. Furthermore,
by Tellegen’s theorem, which we proved earlier, $B^1$ is the annihilator space
of $Z_1$: any voltage distribution obeying Kirchhoff’s voltage law, applied to any
current distribution obeying Kirchhoff’s current law, gives zero. If we form the
quotient space $H^1 = C^1/B^1$, then, it may be identified with the dual space of $H_1$
(also known as $Z_1$). The proof is simple. Let $V_1$ and $V_2$ be vectors in $C^1$ whose
difference lies in $B^1$, so that $V_1 - V_2 = -d\Phi$ for some $\Phi$. Then, given an arbitrary
current vector $I \in Z_1$, for which $\partial I = 0$,

\[ \int_I (V_1 - V_2) = -\int_I d\Phi = -\int_{\partial I} \Phi = 0, \]

so that $\int_I V_1 = \int_I V_2$. Thus $V_1$ and $V_2$, which correspond to the same element of $H^1$,
are also equal as elements of the dual space of $H_1$.
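
The argument can be checked numerically on a small network. The sketch below is an illustration only: the incidence data are borrowed from the three-node, four-branch network analysed in section 12.6 (branch boundaries B − A, C − B, A − C, A − C), and the names `BOUNDARY`, `coboundary`, `boundary` and `pairing` are hypothetical helpers, not notation from the text.

```python
# Rows are nodes A, B, C; columns are branches alpha, beta, gamma, delta.
# Each column encodes the boundary of a branch: head node +1, tail node -1.
BOUNDARY = [
    [-1, 0, 1, 1],    # node A
    [1, -1, 0, 0],    # node B
    [0, 1, -1, -1],   # node C
]


def coboundary(potential):
    """Return d(Phi): on each branch, Phi(head) - Phi(tail)."""
    return [sum(BOUNDARY[n][b] * potential[n] for n in range(3)) for b in range(4)]


def boundary(current):
    """Return the net current arriving at each node."""
    return [sum(BOUNDARY[n][b] * current[b] for b in range(4)) for n in range(3)]


def pairing(current, voltage):
    """Evaluate a voltage vector on a current vector (the integral of V over I)."""
    return sum(i * v for i, v in zip(current, voltage))


# Two cycles: alpha + beta + gamma and -gamma + delta.
cycles = [[1, 1, 1, 0], [0, 0, -1, 1]]

phi = [5, -2, 7]                     # an arbitrary node potential
v1 = [3, 1, -4, 2]                   # an arbitrary branch-voltage vector
v2 = [a + b for a, b in zip(v1, coboundary(phi))]   # so that v1 - v2 = -d(phi)

for cycle in cycles:
    assert boundary(cycle) == [0, 0, 0]               # Kirchhoff's current law
    assert pairing(cycle, coboundary(phi)) == 0       # Tellegen's theorem
    assert pairing(cycle, v1) == pairing(cycle, v2)   # equal as functions on Z_1
```

Integer arithmetic keeps every comparison exact, so the assertions test the identities themselves rather than a floating-point approximation.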

We may summarize the relationship among $C^0$, $C^1$, $H^1$, $B^1$, and $Z_1$ as follows:

The transformation $d$ acts on $C^0$, and maps $C^0 \to C^1$;

The adjoint of $d$ is $\partial$, which maps $C_1 \to C_0$;

The kernel of $\partial$ is $Z_1$, which has as its annihilator space $B^1 = \operatorname{im} d$ and also
has as its dual space $H^1 = C^1/B^1$.


12.5. The Maxwell methods 

We earlier considered a procedure for constructing a basis of meshes for $Z_1$ by
choosing a maximal tree. We denote by $\sigma$ the map that identifies each vector in


basis chosen for $H_1$, and denote by $s$ the map that projects the space $C^1$ onto the
quotient space $H^1 = C^1/B^1$. Then $s$ and $\sigma$ are adjoint transformations, and their


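Constructing the matrices of $\sigma$ and $s$ is easier than it sounds. Consider the network
of figure 12.24, with a maximal tree consisting of $\alpha$ and $\gamma$. Then a basis for $Z_1$ is

\[ M_1 = \alpha + \beta + \gamma, \qquad M_2 = -\gamma + \delta, \]

so that the matrix of $\sigma$ is

\[ \sigma = \begin{pmatrix} 1 & 0 \\ 1 & 0 \\ 1 & -1 \\ 0 & 1 \end{pmatrix}. \]

Note that

\[ \sigma \begin{pmatrix} J_1 \\ J_2 \end{pmatrix} = \begin{pmatrix} J_1 \\ J_1 \\ J_1 - J_2 \\ J_2 \end{pmatrix}, \]

so that $\sigma$ provides the rule for expressing branch currents in terms of mesh currents.

Figure 12.24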








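Note that

\[ sV = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} V^\alpha \\ V^\beta \\ V^\gamma \\ V^\delta \end{pmatrix} = \begin{pmatrix} V^\alpha + V^\beta + V^\gamma \\ -V^\gamma + V^\delta \end{pmatrix}, \]

so that the components of $sV$ simply equal the sum of the branch voltages around
the various meshes we choose as a basis for $H_1$.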

We may summarize all the above relations in a single diagram:

\[ \begin{array}{ccccc} P^0 & \xrightarrow{\;[d]\;} & C^1 & \xrightarrow{\;s\;} & H^1 \\[4pt] B_0 & \xleftarrow{\;[\partial]\;} & C_1 & \xleftarrow{\;\sigma\;} & H_1 \end{array} \]

In this diagram, vector spaces in the same column are dual to one another: $C^1$
and $C_1$, $P^0$ and $B_0$, $H^1$ and $H_1$. Furthermore, $[d]$ and $[\partial]$ are adjoint, as are $s$
and $\sigma$. Both $[d]$ and $\sigma$ are injective (have zero kernel) while their adjoints $[\partial]$ and
$s$ are surjective (have the entire indicated vector space as their image). Finally,
$\operatorname{im} \sigma = \ker [\partial]$ while $\operatorname{im} [d] = \ker s$. With this scheme we can write both of
Kirchhoff’s laws in a symmetrical form:

Kirchhoff’s current law: $I = \sigma J$, $J \in H_1$;
Kirchhoff’s voltage law: $V = -[d]\Phi$, $\Phi \in P^0$.
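
The relations in the diagram can be verified mechanically for a concrete network. The following sketch assumes the mesh basis $M_1 = \alpha + \beta + \gamma$, $M_2 = -\gamma + \delta$ and the restricted boundary matrix of the three-node example treated in section 12.6 (ground node A); the helper names `transpose` and `matmul` are introduced here purely for illustration.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# sigma: H_1 -> C_1, columns are the meshes M_1 and M_2.
sigma = [[1, 0], [1, 0], [1, -1], [0, 1]]

# [boundary]: C_1 -> B_0, rows are the non-ground nodes B and C.
restricted_boundary = [[1, -1, 0, 0], [0, 1, -1, -1]]

# The adjoints are represented by the transposed matrices.
s = transpose(sigma)
restricted_d = transpose(restricted_boundary)

# im(sigma) lies in ker([boundary]) and im([d]) lies in ker(s).
boundary_of_meshes = matmul(restricted_boundary, sigma)
s_of_exact = matmul(s, restricted_d)

assert boundary_of_meshes == [[0, 0], [0, 0]]
assert s_of_exact == [[0, 0], [0, 0]]
```

The vanishing products show only the inclusions. Equality of image and kernel then follows from a dimension count: here $\sigma$ has rank 2 and the kernel of $[\partial]$ has dimension $4 - 2 = 2$.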


irchhoff’s laws— 


We are now in a position to discuss Kirchhoff’s theorem on resistive circuits. We
shall assume that we have an electric circuit made up of resistors for which the
operating characteristic of each branch $\alpha$ is a straight line in the $(I_\alpha, V^\alpha)$ plane. We
shall also make the realistic (and simplifying) assumption that this line is not
parallel to either of the axes. This means that any voltage source is in series with
some resistance and any current source is in parallel with some resistance: there
is no source that can supply constant voltage no matter what the current drawn
and no source that provides constant current no matter what the voltage. We thus
assume that each branch $\alpha$ with $\partial\alpha = B - A$ looks like figure 12.25.

The voltage across the resistor is $V^\alpha - W^\alpha$ and the current through the resistor
is $I_\alpha - K_\alpha$. Thus the characteristic of the system is given as

\[ (V^\alpha - W^\alpha) = z_\alpha (I_\alpha - K_\alpha). \]

Here $W^\alpha$, $K_\alpha$ (either of which might be zero) are given, as is $z_\alpha \neq 0$. In a purely
resistive circuit $z_\alpha$ is a positive real number. We can also consider the case of the
steady state of a linear circuit containing capacitors and inductors, in which case
$z_\alpha$ is a complex number depending on the frequency. The $I_\alpha$ and $V^\alpha$ are unknowns.
We let $Z: C_1 \to C^1$ be the linear transformation whose matrix (in terms of the basis
consisting of the branches) is the diagonal matrix with entries $z_\alpha$. We can then
write the above as

\[ V - W = Z(I - K). \]

Combined with Kirchhoff’s laws we obtain the equations

\[ V - W = Z(I - K), \qquad I = \sigma J, \qquad V = -[d]\Phi \]

for the unknowns $J \in H_1$ and $\Phi \in P^0$, where $W$, $K$ and $Z$ are given. We can try to
solve these equations by either of the following two methods.

The mesh current method

(i) Write $I = \sigma J$ to ensure that Kirchhoff’s current law is satisfied, then apply $s$
to the equation $V - W = Z(\sigma J - K)$. Since, by Kirchhoff’s voltage law, $sV =
s(-[d]\Phi) = 0$, we obtain $-sW = sZ\sigma J - sZK$ or $(sZ\sigma)J = s(ZK - W)$. The right-hand
side of this equation is given, as is the linear transformation $sZ\sigma$. If $sZ\sigma$ is
invertible we obtain

\[ J = (sZ\sigma)^{-1} s(ZK - W) \qquad (12.5) \]

from which all currents and voltages may be obtained by $I = \sigma J$, $V = W + Z(I - K)$.
This is Maxwell’s mesh-current method. It depends upon inverting $sZ\sigma$.

The node potential method

(ii) Write $V = -[d]\Phi$ to ensure that Kirchhoff’s voltage law is satisfied. Then
invert $Z$, which is just a diagonal matrix, to obtain $I - K = Z^{-1}(-[d]\Phi - W)$.
Now apply $[\partial]$ to both sides. Since, by Kirchhoff’s current law, $[\partial]I = 0$, we
obtain

\[ -[\partial]K = -[\partial]Z^{-1}[d]\Phi - [\partial]Z^{-1}W \]

or

\[ ([\partial]Z^{-1}[d])\Phi = [\partial](K - Z^{-1}W). \]

If $[\partial]Z^{-1}[d]$ is invertible we obtain

\[ \Phi = ([\partial]Z^{-1}[d])^{-1}[\partial](K - Z^{-1}W). \]

This is Maxwell’s node-potential method. It depends upon inverting $[\partial]Z^{-1}[d]$.


12.6. Matrix expressions for the operators 

Equations (12.5)-(12.7) give the essence of Maxwell’s methods. To make them
work in practice we need matrix expressions for the operators


choices are made and work out some examples. The proof that the methods always
work for resistive networks will be given at the beginning of Section 12.7.

For the spaces $B_0$, $H_0$, $Z^0$, and $P^0$, a basis follows from a choice of a ground
node in each connected component of the network. For example, in the network of
figure 12.26, we might choose node A as ground in one connected component, node
E in the other. Then a basis for $B_0$ consists of the boundaries of paths joining ground



basis elements are 



$B_0$ to a basis for $C_0$ by choosing basis vectors which have 1 in the position of one
ground node, zero elsewhere, and we take the equivalence class of each of these. In











$P^0 = C^0/Z^0$, we simply choose potentials which are 1 at each non-ground node
in turn, and take the equivalence class of each.
These equivalence classes are clearly dual to the basis $\{b_1, b_2, b_3, b_4\}$ for $B_0$ which
we constructed earlier. This is why the matrix representing $[d]: P^0 \to C^1$ is the
transpose of the matrix representing $[\partial]: C_1 \to B_0$. The columns of $[d]$, and the rows

also the quotient space $G_1 = C_1/Z_1$, which is dual to $B^1$. For all these spaces, the
choice of basis is governed by a choice of maximal tree, such as branches $\alpha$, $\beta$, $\varepsilon$ in
figure 12.27.

The basis elements for $Z_1$, as we have already seen, are the meshes associated



voltage law, we may construct a basis by assigning unit voltage to one branch in
the maximal tree, zero to the other branches in the maximal tree. This assignment

be used to determine the voltages across the non-tree branches. Equivalently, since
each branch which is not in the maximal tree is associated with a unique mesh,




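in turn to determine the voltages across the non-tree branches. For the example of
figure 12.27 this procedure leads to basis vectors associated with $\alpha$, $\beta$, $\varepsilon$ respectively:

\[ b^1 = (1 \;\; 0 \;\; 1 \;\; {-1} \;\; 0), \]
\[ b^2 = (0 \;\; 1 \;\; {-1} \;\; 0 \;\; 0), \]
\[ b^3 = (0 \;\; 0 \;\; 0 \;\; {-1} \;\; 1). \]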


These annihilate the meshes $m_1$ and $m_2$, and they are dual to the basis $g_1$, $g_2$, $g_3$ for

in the tree. They are equivalence classes of voltage vectors which have $+1$ in the

. . f* i i -i 


for $H^1$ are thus

\[ h^1 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \qquad h^2 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \]

which correspond to unit violations of Kirchhoff’s voltage law in mesh 1 and mesh 2
respectively. Clearly $h^1$ and $h^2$ are dual to the meshes $m_1$ and $m_2$. Because of this,
the matrix representing $s: C^1 \to H^1$ is the transpose of the matrix representing
$\sigma: Z_1 \to C_1$.

An alternative method of choosing basis vectors for the space $B^1$ of voltages
that obey Kirchhoff’s voltage law is sometimes convenient. We simply choose the




and apply $d$ to the resulting potentials. (Equivalently, we take the potential as $-1$
in turn at each non-ground node and write down the branch voltages.) In the present
example, with node A chosen as ground, this leads to the basis vectors

\[ a^1 = (0 \;\; {-1} \;\; 1 \;\; 0 \;\; 0), \]
\[ a^2 = (1 \;\; 1 \;\; 0 \;\; {-1} \;\; 0), \]
\[ a^3 = (0 \;\; 0 \;\; 0 \;\; 1 \;\; {-1}), \]

associated with nodes B, C, and D respectively. As dual basis vectors in $G_1 = C_1/Z_1$,
we must choose current distributions in which unit current flows from the ground
node along the tree to each non-ground node in turn. In the example, this leads to

\[ k_1 = \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \qquad k_2 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \qquad k_3 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ -1 \end{pmatrix}, \]

which correspond respectively to unit violations of Kirchhoff’s current law at


Let us now apply both of Maxwell’s methods to the circuit of figure 12.28, which
has the topology which we considered earlier while analyzing figure 12.24. We must
first make an arbitrary choice of which node will be ground: let us select node A,
so that $\Phi_A = 0$. The branch voltages are now expressed in terms of the remaining
node potentials as follows:

\[ V^\alpha = -\Phi_B, \quad V^\beta = \Phi_B - \Phi_C, \quad V^\gamma = \Phi_C, \quad V^\delta = \Phi_C. \]









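As a check on this calculation, we form the matrix $\partial$. Since $\partial\alpha = B - A$; $\partial\beta = C - B$;
$\partial\gamma = A - C$; $\partial\delta = A - C$, we have

\[ \partial = \begin{pmatrix} -1 & 0 & +1 & +1 \\ +1 & -1 & 0 & 0 \\ 0 & +1 & -1 & -1 \end{pmatrix}. \]

To form $[\partial]$, we simply delete the row of $\partial$ which corresponds to the chosen ground
node A. With the first row thus deleted we have

\[ [\partial] = \begin{pmatrix} +1 & -1 & 0 & 0 \\ 0 & +1 & -1 & -1 \end{pmatrix}. \]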




With the meshes $M_1$ and $M_2$ chosen as indicated in figure 12.28, we see that the



terms of the branch voltages. 








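Around mesh 1: $\mathcal{E}_1 = V^\alpha + V^\beta + V^\gamma$;
Around mesh 2: $\mathcal{E}_2 = -V^\gamma + V^\delta$.

In order to have $\mathcal{E} = sV$, we must take

\[ s = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix}. \]

As expected, $s$ is the transpose of $\sigma$.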

The next step is to write down the vectors $W$ and $K$ which represent the voltage
and current sources in the four branches. The sign convention for $W$ is that a battery
voltage is considered positive if the battery contributes positively to the voltage
drop when the branch is traversed in the sense indicated by the arrow. Referring to
figure 12.28, we find

\[ W^\alpha = -8, \quad W^\beta = 0, \quad W^\gamma = +11, \quad W^\delta = 0. \]

Similarly, a current source counts as positive if it contributes positively to current
flow in the direction of the arrow, so that

\[ K_\alpha = +3, \quad K_\beta = 0, \quad K_\gamma = 0, \quad K_\delta = +1. \]

We can therefore describe the sources in the network by the two vectors

\[ W = \begin{pmatrix} -8 \\ 0 \\ 11 \\ 0 \end{pmatrix} \quad \text{and} \quad K = \begin{pmatrix} 3 \\ 0 \\ 0 \\ 1 \end{pmatrix}. \]

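Finally, we write down the matrix $Z$ which describes the resistances. Since

\[ R_\alpha = 2, \quad R_\beta = 1, \quad R_\gamma = 3, \quad R_\delta = 4, \]

we have

\[ Z = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 4 \end{pmatrix}. \]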

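To apply the mesh-current method we use the equation

\[ J = (sZ\sigma)^{-1} s(ZK - W). \]

Matrix multiplication gives

\[ sZ\sigma = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 0 \\ 1 & -1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 6 & -3 \\ -3 & 7 \end{pmatrix} \]

so that

\[ (sZ\sigma)^{-1} = \frac{1}{33} \begin{pmatrix} 7 & 3 \\ 3 & 6 \end{pmatrix}. \]

We also find

\[ ZK - W = \begin{pmatrix} 6 \\ 0 \\ 0 \\ 4 \end{pmatrix} - \begin{pmatrix} -8 \\ 0 \\ 11 \\ 0 \end{pmatrix} = \begin{pmatrix} 14 \\ 0 \\ -11 \\ 4 \end{pmatrix} \]

so that

\[ s(ZK - W) = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} 14 \\ 0 \\ -11 \\ 4 \end{pmatrix} = \begin{pmatrix} 3 \\ 15 \end{pmatrix}. \]

Then, at last,

\[ J = \frac{1}{33} \begin{pmatrix} 7 & 3 \\ 3 & 6 \end{pmatrix} \begin{pmatrix} 3 \\ 15 \end{pmatrix} = \frac{1}{33} \begin{pmatrix} 66 \\ 99 \end{pmatrix} \]

so the solution is $J_1 = 2$, $J_2 = 3$. Finally,

\[ I = \sigma J = \begin{pmatrix} 2 \\ 2 \\ -1 \\ 3 \end{pmatrix}, \qquad V = W + Z(I - K) = \begin{pmatrix} -10 \\ 2 \\ 8 \\ 8 \end{pmatrix}. \]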





If you look back at the calculation, you will notice several features which in
fact hold for any network.

(1) In $sZ\sigma$, each diagonal entry is the sum of the resistances of the branches
which constitute a mesh. For example, mesh $M_1$, consisting of branches
$\alpha$, $\beta$, and $\gamma$, has a total resistance of 6 ohms, the upper left entry of the matrix.

(2) In $sZ\sigma$, each off-diagonal entry is (up to a $\pm$ sign) the resistance common to
two meshes. In the circuit which we have been considering, meshes 1 and 2
have 3 ohms in common. The $-$ sign arises because $\gamma$ occurs in $M_1$ and $M_2$
with opposite sign.

(3) The components of $(sZ\sigma)J$ are the voltage drops which would exist around
each mesh if there were no sources. With $J_1 = 2$, $J_2 = 3$, you can check, for
example, that there would be a net drop of 3 volts around mesh 1.

(4) The components of $ZK - W$ are the voltage rises which would exist in the
branches if all mesh currents were zero. For example, if $J_1$ were zero, there
would be a current of 3 amperes down through the 2 ohm resistor in branch
$\alpha$ and a rise of $(2 \cdot 3 + 8) = 14$ volts across $\alpha$.

(5) The components of $s(ZK - W)$ are the net voltage rises which would occur
around the various meshes if all branch currents were zero. For example,
with $J_1 = J_2 = 0$, there would be a total rise of 3 volts around mesh 1.

(6) The equation $(sZ\sigma)J = s(ZK - W)$ therefore states that, for each mesh, the
sum of the voltage drops caused by the mesh currents flowing through the
branch resistances equals the sum of the voltage rises which would exist if the mesh
currents were all zero, so that the total voltage drop around each mesh, due
to both the mesh currents and the sources, is zero as required by Kirchhoff’s
voltage law.
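
The mesh-current calculation above can be reproduced in a few lines. This is a minimal sketch rather than a general circuit solver: it uses exact rational arithmetic from the standard library, the source vectors $W = (-8, 0, 11, 0)$ and $K = (3, 0, 0, 1)$ and resistances $(2, 1, 3, 4)$ of the worked example, and hypothetical helper names (`transpose`, `matmul`, `matvec`, `solve`).

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination over the rationals."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


sigma = [[1, 0], [1, 0], [1, -1], [0, 1]]     # meshes M_1, M_2 as columns
s = transpose(sigma)
resistances = [2, 1, 3, 4]
Z = [[resistances[i] if i == j else 0 for j in range(4)] for i in range(4)]
W = [-8, 0, 11, 0]                            # battery voltages
K = [3, 0, 0, 1]                              # current sources

mesh_matrix = matmul(matmul(s, Z), sigma)     # s Z sigma
source_term = matvec(s, [zk - w for zk, w in zip(matvec(Z, K), W)])   # s(ZK - W)
J = solve(mesh_matrix, source_term)           # mesh currents
I = matvec(sigma, J)                          # branch currents
V = [w + r * (i - k) for w, r, i, k in zip(W, resistances, I, K)]     # branch voltages
```

With these inputs `mesh_matrix` is `[[6, -3], [-3, 7]]` and `source_term` is `[3, 15]`, matching the matrices displayed in the worked example.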



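To apply the node-potential method, we use the equation

\[ \Phi = ([\partial]Z^{-1}[d])^{-1}[\partial](K - Z^{-1}W). \]

Matrix multiplication gives

\[ [\partial]Z^{-1}[d] = \begin{pmatrix} 3/2 & -1 \\ -1 & 19/12 \end{pmatrix}, \qquad [\partial](K - Z^{-1}W) = \begin{pmatrix} 7 \\ 8/3 \end{pmatrix}, \]

so that the unknown node potentials are $\Phi_B = 10$, $\Phi_C = 8$.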


Look back at this calculation to confirm the following points, which again are true
for any network.

(1) Each diagonal entry in $[\partial]Z^{-1}[d]$ is the sum of the reciprocal resistances
of the branches connected to a node. For example, node B has resistances of 2 ohms and 1
ohm connected to it, and the upper left entry is $\tfrac{1}{2} + 1 = \tfrac{3}{2}$.

(2) Each off-diagonal entry in $[\partial]Z^{-1}[d]$ is (up to a $\pm$ sign) the reciprocal of
the resistance in the branch which joins two nodes. Since nodes B and C are
joined by a 1 ohm resistor, the off-diagonal entries are $-1$.

(3) The components of $[\partial]Z^{-1}[d]\Phi$ represent the net current which would
flow out of each node if there were no sources. For example, with $\Phi_A = 0$,
$\Phi_B = 10$, $\Phi_C = 8$ and no sources, there would be 5 amps flowing from B to A through
the 2 ohm resistor, and 2 amps flowing from B to C through the 1 ohm
resistor, a total of 7 amps. You can check that the first component of
$[\partial]Z^{-1}[d]\Phi$ is 7.

(4) The components of $K - Z^{-1}W$ are the branch currents which would exist if
all node potentials were zero. For example, if both $\Phi_A$ and $\Phi_B$ were zero,
then there would be a current of 4 amperes through the 2 ohm resistor in
branch $\alpha$ and a total branch current of 4 + 3 = 7 amps. The first component
of $K - Z^{-1}W$ is 7.

(5) The components of $[\partial](K - Z^{-1}W)$ are the net currents which would flow
into each node if all node potentials were equal to zero. For example, in the
case $\Phi_A = \Phi_B = \Phi_C = 0$, 7 amps would flow into node B through branch $\alpha$,
so that the net current into node B would be 7 amperes.

(6) The equation $[\partial]Z^{-1}[d]\Phi = [\partial](K - Z^{-1}W)$ states that for each node, the
sum of the currents out of the node which flow through the branch
resistances as a result of node potential differences equals the sum of the
currents which would flow into the node if all node potentials were zero, so
that the total current entering each node, due to both the node potentials
and the sources, is zero in accordance with Kirchhoff’s current law.
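
The node-potential calculation can be checked in the same way. The sketch below again uses exact fractions and the data of the worked example; the restricted boundary matrix is written with rows for the non-ground nodes B and C, and the helper names are illustrative assumptions rather than notation from the text.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination over the rationals."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


restricted_boundary = [[1, -1, 0, 0], [0, 1, -1, -1]]   # rows: nodes B and C
restricted_d = transpose(restricted_boundary)
conductances = [Fraction(1, r) for r in (2, 1, 3, 4)]   # diagonal of Z^{-1}
Z_inv = [[conductances[i] if i == j else 0 for j in range(4)] for i in range(4)]
W = [-8, 0, 11, 0]
K = [3, 0, 0, 1]

node_matrix = matmul(matmul(restricted_boundary, Z_inv), restricted_d)
source_term = matvec(restricted_boundary,
                     [k - g * w for k, g, w in zip(K, conductances, W)])
Phi = solve(node_matrix, source_term)                   # potentials at B and C
V = [-x for x in matvec(restricted_d, Phi)]             # V = -[d] Phi
I = [k + g * (v - w) for k, g, v, w in zip(K, conductances, V, W)]
```

The branch currents and voltages obtained this way agree with those of the mesh-current method, as they must, since both methods solve the same system of equations.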

If you compare the six statements about the node-potential method with the
six about the mesh current, you will notice a remarkable duality. Replace node by
mesh, current by voltage, and resistance by conductance, and they are the same.

One feature of the above example that does not hold in general is that the size
of the matrix that had to be inverted was the same in both cases. That happened
because there were two independent meshes and two non-ground nodes. In general
one method may involve inverting a larger matrix than the other.
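
The two matrix sizes can be read off from the topology alone. The sketch below assumes the count of independent meshes $m = b + 1 - n$ for a connected network with $b$ branches and $n$ nodes, extended in the usual way to $m = b - n + c$ when there are $c$ connected components; the function names are hypothetical.

```python
def system_sizes(branches, nodes, components=1):
    """Return (independent meshes, non-ground nodes) for a network."""
    meshes = branches - nodes + components
    non_ground_nodes = nodes - components
    return meshes, non_ground_nodes


def smaller_method(branches, nodes, components=1):
    """Name the Maxwell method that requires inverting the smaller matrix."""
    meshes, non_ground_nodes = system_sizes(branches, nodes, components)
    if meshes < non_ground_nodes:
        return "mesh-current"
    if non_ground_nodes < meshes:
        return "node-potential"
    return "either"


# The worked example: four branches, three nodes, one component.
assert system_sizes(4, 3) == (2, 2)
assert smaller_method(4, 3) == "either"

# A connected network with five branches and three nodes has three meshes
# but only two non-ground nodes, so the node-potential matrix is smaller.
assert smaller_method(5, 3) == "node-potential"
```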


M i iu\w i ■ HHIlIHI 


U mu m m w 11 h 


network. The issue is whether the mapping $sZ\sigma$ can be inverted. Since $sZ\sigma$
is a map from the space $H_1$ (mesh currents) to its dual, $H^1$ (voltage around meshes),
which has the same dimension, it is enough to show that $sZ\sigma$ is
injective; i.e., that its kernel is zero. Consider any non-zero element $J$ in $H_1$. Its
image $(sZ\sigma)J$ is an element of the dual space $H^1$, and we wish to show that it is
not the zero element. We do this by simply evaluating it on the element $J$; since
$s$ is the adjoint of $\sigma$ we have $[(sZ\sigma)J](J) = [Z\sigma J](\sigma J)$. But $\sigma J$ is just the vector
of branch currents $I$, while $Z\sigma J = ZI$. Hence

\[ [(sZ\sigma)J]J = \int_I ZI = \sum_\alpha z_\alpha I_\alpha^2 \]

where the sum is over all branches. If the matrix $Z$ has only positive diagonal
entries, then $\int_I ZI > 0$ unless $I = 0$. But, since $\sigma$ is injective, $I = \sigma J = 0$
implies $J = 0$; that is, the branch currents are all zero only if all mesh currents are





inductors, the above procedure gives us some interesting information. For a
network with $b$ branches, $Z$ will still be given by a $b \times b$ diagonal matrix, whose
entries, the impedances of the branches, may be functions of frequency $\omega/2\pi$. If
each branch contains only one capacitor or inductor, each impedance will be $i\omega L$
or $-i/\omega C$. If there are $m$ independent meshes, the matrix of $sZ\sigma$ will be an $m \times m$
matrix whose entries are functions of $\omega$, and its determinant will give rise to a






invertible if $P(\omega^2) = 0$, which means that there can be at most $m$ values of $\omega^2$ for




iiiVMiiiiiri 


niMiiiruarMi 




Mutfiid rt iJiMmi wil» ft Prs** TM rzk 


the system. For each of these resonant frequencies, the equation $(sZ\sigma)J = 0$ has a
non-zero solution even though there are no source terms on the right; that is,
currents can flow even though there are no voltage or current sources. The solutions
of this equation determine the normal modes of the system. A similar argument,
applied to the matrix of $[\partial]Z^{-1}[d]$, whose size is $(b - m) \times (b - m)$, shows that there
can be at most $(b - m)$ resonant frequencies. So to find the resonant frequencies
of a network, we simply form the matrix of $sZ\sigma$ or $[\partial]Z^{-1}[d]$, whichever is smaller,
and calculate for what values of $\omega^2$ its determinant is zero.

As an illustration of this method, consider the network of figure 12.29. This has 











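The matrix $Z$ now involves the impedances of the various components:

\[ z_\alpha = 2i\omega L, \quad z_\beta = -i/\omega C, \quad z_\gamma = i\omega L, \quad z_\delta = -i/2\omega C. \]

Then

\[ Z = i \begin{pmatrix} 2\omega L & 0 & 0 & 0 \\ 0 & -(\omega C)^{-1} & 0 & 0 \\ 0 & 0 & \omega L & 0 \\ 0 & 0 & 0 & -(2\omega C)^{-1} \end{pmatrix}. \]

Matrix multiplication yields

\[ sZ\sigma = i \begin{pmatrix} 3\omega L - (\omega C)^{-1} & -\omega L \\ -\omega L & \omega L - (2\omega C)^{-1} \end{pmatrix} \]

so that

\[ \operatorname{Det}(sZ\sigma) = -(3\omega L - (\omega C)^{-1})(\omega L - (2\omega C)^{-1}) + \omega^2 L^2. \]

Multiplying by $-\omega^2$, we see that the determinant vanishes when

\[ D(\omega^2) = 2L^2\omega^4 - 5L\omega^2/2C + (2C^2)^{-1} = 0. \]

This factors readily:

\[ (L\omega^2 - C^{-1})(2L\omega^2 - (2C)^{-1}) = 0. \]

The resonant frequencies of the network are therefore given by

\[ \omega^2 = 1/LC \quad \text{and} \quad \omega^2 = 1/4LC. \]

To find the corresponding normal modes, we solve the equation $(sZ\sigma)J = 0$ for each
resonant frequency. If $\omega = \sqrt{1/LC}$, for example,

\[ sZ\sigma = i \begin{pmatrix} 2\sqrt{L/C} & -\sqrt{L/C} \\ -\sqrt{L/C} & \tfrac{1}{2}\sqrt{L/C} \end{pmatrix} \]

and a solution to $(sZ\sigma)J = 0$ is $J = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$, which means that the current in mesh 2 is
twice the current in mesh 1. Similarly, setting $\omega = 1/(2\sqrt{LC})$, we have

\[ sZ\sigma = i \begin{pmatrix} -\tfrac{1}{2}\sqrt{L/C} & -\tfrac{1}{2}\sqrt{L/C} \\ -\tfrac{1}{2}\sqrt{L/C} & -\tfrac{1}{2}\sqrt{L/C} \end{pmatrix} \]

so that a solution to $(sZ\sigma)J = 0$ is $J = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$, which corresponds to a normal mode of
frequency $\omega/2\pi$ where the mesh currents $J_1$ and $J_2$ have the same amplitude but
opposite phase.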
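
A quick numerical check of the resonance condition may be helpful. The sketch below builds the $2 \times 2$ matrix $sZ\sigma$ given above for this network as a function of $\omega$; the component values $L = 2$ and $C = 0.5$ are arbitrary test values chosen for illustration, and the function names are hypothetical.

```python
import math


def mesh_matrix(omega, L, C):
    """Return s Z sigma for the two-mesh LC network at angular frequency omega."""
    x_l = omega * L            # omega L
    x_c = 1.0 / (omega * C)    # 1 / (omega C)
    return [
        [1j * (3 * x_l - x_c), -1j * x_l],
        [-1j * x_l, 1j * (x_l - x_c / 2)],
    ]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


L, C = 2.0, 0.5                               # arbitrary illustrative values
omega_fast = 1 / math.sqrt(L * C)             # omega^2 = 1/LC
omega_slow = 1 / (2 * math.sqrt(L * C))       # omega^2 = 1/4LC

# The determinant vanishes at both resonant frequencies ...
assert abs(det2(mesh_matrix(omega_fast, L, C))) < 1e-9
assert abs(det2(mesh_matrix(omega_slow, L, C))) < 1e-9

# ... and the stated mesh-current vectors are the corresponding normal modes.
assert all(abs(x) < 1e-9 for x in apply(mesh_matrix(omega_fast, L, C), [1, 2]))
assert all(abs(x) < 1e-9 for x in apply(mesh_matrix(omega_slow, L, C), [1, -1]))
```

Away from these two frequencies the determinant is non-zero, so the homogeneous equation has only the zero solution.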




frequencies of circuits containing just inductors and capacitors, may be employed




*. eac y-s a c c < 


( ( 


also to analyze steady-state alternating-current circuits. We assume that all voltage
and current sources have the same frequency $\omega/2\pi$, so that branch currents are of
the form



and the branch voltages are of the form 
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simplifies to 

p _l p t 2 


* a 2*^1 cl * 


For a branch consisting simply of a capacitor or inductor, whose impedance z a is 


. 1 1 • • 4 1 # 



As an elementary example of a steady-state circuit, consider the so-called low-pass 
filter shown in figure 12.30. We may regard this circuit as having two branches. 

A basis for $Z_1$ is the mesh $\alpha + \beta$, so that $\sigma = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $s = (1 \;\; 1)$. Since $Z =$






current is simply 



This voltage decreases from $\mathcal{E}$ when $\omega = 0$ to zero as $\omega \to \infty$, hence the name



We may make the preceding circuit slightly more complicated by placing an 


les wi 








$z_L = i\omega L - i/\omega C$, and the mesh current is $J = \mathcal{E}/(R + i(\omega L - (\omega C)^{-1}))$. This mesh
current clearly has its peak value if $\omega L = 1/\omega C$ or $\omega = \sqrt{1/LC}$. This is the so-called




Steady-state circuits and filters 


Figure 12.31 


Figure 12.32 


Z consists of the meshes M x = a + /? and M 2 = a + y, so that 

~ V] a P- 

a = 1 — 1 and s = 

_ jo J V° - 1 ^ 

^0 0 \ 

Since Z = 0 ic oL 0 , we have 


/ „ \ A 0\ 

imL = 
\0— ic oL — i/coCj l 0 ~ 


— icoLicoL — i/co 


(sZa)-^---7 

R flcoL-i/coQ + LCT 1 \ 

To determine the mesh currents, we write that 


icoL — i/c oC ic oL 


J = (sZ<j)-H-sW) = 


f\coL — \/(oC\ 
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Of interest here is the solution for the resonant frequency co/2n, where coL = — 1/coC 

and J — -—f ) so that there is no current in mesh M, and no power 
L/C\i<joLj - 

dissipated in the resistor. 

Of course our techniques permit us to analyze alternating-current circuits of 
arbitrary complexity just as easily as direct-current circuits with the same topology. 
Consider, for example, the following two-stage filter.

Figure 12.33


Since this circuit has three meshes but only two non-ground nodes, it is easiest 
to analyze it by the node-potential method. We choose node A as ground node, 
so that 



p 


[S] = (o "o '! -! -°i) and M- 

1 o 1 
-l i 

i o -i 1 

■ 




Then [g]Z~ 1 [d] = 


7—i/coL 0 \ 

/ a \ 


i 

fi -1 -1 o try 

ic oC 0 1 


! 

{0 0 1-1-1/ 

i/ioL — i/coL 

0 i coC 



Lq -1 /R J 



/ ic oC — 2i/c oL i/coL 

\ i/coLic oC — (i/coL) + l/R 


Using the formula 


Q = ([g]Z~ 1 [d])~ 1 [g]Z~ 1 W 

it is a straightforward matter to invert this matrix and determine the potentials
at nodes B and C. We content ourselves with solving the problem for the resonant
frequency at which $\omega C = 1/\omega L$. Then the matrix to be inverted is simply


raz-^d] 


— i coC 
i coC 
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By noting that [S]Z *W = 
immediately 



i^coC\ 
0 ) 


at this frequency, we see 




i 


l m 


i(oC\/i(oC " 


\0 C / (— i coC/R) — co 2 C 2 \ — icoC— icoCJ\ 0 


or 


'4 1 




<f> c J (—icoC/R) — co 2 C 2 \co 2 C 2 


Summary 

A One-complexes 

Given a diagram representing a one-complex, you should know how to construct
a maximal tree and the associated basis of meshes for $Z_1$, and write the matrices
representing $\sigma$, $s$, $\partial$, and $d$.

You should know the definitions of the subspaces $B_0$, $H_0$, $Z_1$, $H^0$, $P^0$, and $B^1$
and be able to construct bases for them.

B Resistive networks

You should be able to write down the relation between $V$ and $I$ for a branch
containing a battery, current source, and resistor.

You should be able to derive and apply Maxwell’s mesh-current and node-potential
formulas for solving electrical networks.

C Alternating-current networks

You should know how to use Maxwell’s methods to find steady-state solutions
and normal modes for networks involving inductors, capacitors, and sinusoidal
voltage or current sources.


Exercises 

12.1,2. For figures 12.34 and 12.35, do the following:

Figure 12.34

(a) Find an independent set of meshes, and write down the element
of $C_1$ corresponding to each mesh. (You may use notation like
$M_1 = \alpha + \beta - \gamma$.) Check that the number of independent meshes, $m$,
satisfies $m = b + 1 - n$.

(b) Express the current in each branch in terms of mesh currents, then find
expressions for the branch voltages.

(c) By applying Kirchhoff’s voltage law to each mesh, construct a set of
simultaneous equations for the mesh currents, and solve them.

(d) Choose node A as ground and express voltages and currents in terms
of the remaining unknown node potentials. By applying Kirchhoff’s
current law to each node, obtain a set of simultaneous equations for
the node potentials, and solve them.

12.3,4. For figures 12.36 and 12.37, do the following:

Figure 12.36

(a) Determine the dimension of the spaces $C_1$, $Z_1$, $C_0$, and $H_0$. Check that
$\dim H_0 - \dim Z_1 = \dim C_0 - \dim C_1$.

(b) For each connected component, find a tree $T$ which connects all the
nodes. Using these trees, construct a basis for $Z_1$, and write down
explicitly the projection $p_T$ in a form such as

$p_T(\alpha) = 0$,

$p_T(\beta) = \beta + \gamma - \delta$, etc.

(c) Determine how many mesh equations and how many node equations
would be required to solve a circuit with the given topology.

12.5. For each of the three branches shown in figure 12.38, determine the
relation between $V$ and $I$. In each case, construct a branch with the same
relation by using only a suitably chosen battery and resistor (but no
current source) and also by using a suitable current source and resistor
(but no battery).

12.6. Consider the complex shown in figure 12.39.

(a) Write down the matrices $\partial$ and $d$. Take the order of branches to be
$\alpha$, $\beta$, $\gamma$, $\delta$, $\varepsilon$, $\phi$, even though these are not the first six letters of the Greek
alphabet!



operator $[d]: P^0 \to C^1$ relative to this basis.

(c) Select a maximal tree in each component by choosing branches $\alpha$, $\delta$,


combining the remaining branches $\beta$, $\gamma$, and $\phi$ with branches in the
tree (always choosing the mesh so that $\beta$, $\gamma$, or $\phi$ has coefficient $+1$).
Write down the matrix which represents $\sigma: Z_1 \to C_1$ relative to this
basis.


(d) Show that the columns of the matrix $[d]$ determine a basis for $B^1$.
Associate vectors of the quotient space $H^1 = C^1/B^1$ with the three
branches not in the maximal trees, obtaining



Show that these three elements form a basis for $H^1$ which is dual to the
basis chosen for $Z_1$.

(e) Relative to this basis for $H^1$, construct the matrix $s: C^1 \to H^1$. Check
that $s$ is the transpose of $\sigma$.

(f) Show that the boundaries of the paths joining ground nodes to non-ground
nodes form a basis for $B_0$ which is dual to the basis chosen for
$P^0$.

(g) Show that the equivalence classes of the ground nodes,

form a basis for $H_0 = C_0/B_0$ which is dual to the basis chosen for $Z^0$.

12.7. For Exercise 12.1 construct $[d]$, $[\partial]$, $\sigma$, $s$, $Z$, $K$, and $W$. Set up the mesh-current
and node-potential equations by using these matrices and vectors,
and check that the equations are the same. Let node A be ground and let $\beta$
and $\delta$ be the maximal tree.

12.8. Solve Exercise 12.2 by the same method, with A as the ground node and
$\alpha$, $\beta$, and $\delta$ as the maximal tree.

12.9. Use the mesh-current method to determine the branch currents $I_\beta$, $I_\gamma$ for
the network in figure 12.40. Use the meshes defined by choosing $\beta$ as a
maximal tree.
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Figure 12.41 


12.10. Consider the electrical network in figure 12.41. Suppose that the matrix 
representation of s is 

- +1 0 +1 \ - 

V o o+i - 1 / 

Indicate clearly what basis for $Z_1$ is implied by this form of $s$. What
maximal tree would lead to this basis? Suppose that a basis for $P^0$ has been
chosen as follows:


P 1 


is the equivalence class 
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12.11. Consider the electrical network shown in figure 12.42. 



Write down the matrices $\partial$ and $d$ for this network. Branches $\alpha$ and $\gamma$
constitute a maximal tree for the network. Write down the basis for $Z_1$
which corresponds to this maximal tree, and construct the matrix $\sigma$
relative to this basis. Use the mesh-current method to determine all the
branch currents for the given network.






for the network in figure 12.43, i.e., determine the values of $\omega/2\pi$ for which
$sZ\sigma$ is singular, and find the solution of $(sZ\sigma)J = 0$ for these frequencies.


ground node, you still get a: 
normal modes? 


i matrix to invert. Why are there only two 


C L p B L 8 D 



Figure 12.43 


(c) Convert the same network to a network with only two nodes by
combining $\alpha$ and $\beta$ in series and $\delta$ and $\varepsilon$ in series, thereby eliminating nodes
C and D. Solve by the node-potential method. This time you only have to
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12.13. ( a ) Use the mesh-current method to determine the normal modes of 

oscillation of the network shown in figure 12.44. For each mode,
express the frequency $\omega/2\pi$ in terms of the quantity $\omega_0 = 1/\sqrt{L_0 C_0}$,
and determine the ratio, $J_2/J_1$, of mesh currents.

(b) Use the node-potential method to find the normal mode frequencies in the
same network.

12.14. (a) Construct the complex-valued matrix $sZ\sigma$ for the network shown in
figure 12.45.


C B 



(b) Suppose that the generator in branch $\alpha$ supplies a voltage which is the real
part of $\mathcal{E}_0 e^{i\omega t}$. Find expressions for the steady-state voltages $V^\beta$ and $V^\delta$.

12.15. Suppose that a current source that supplies a current $K = \operatorname{Re}(K_0 e^{i\omega_0 t})$,
where $\omega_0 = 1/\sqrt{L_0 C_0}$, is connected in series with the capacitor between
nodes A and B of the network of Exercise 12.13.
mesh currents $J_1$ and $J_2$.
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In Chapter 13 we continue the study of electrical networks. 
We examine the boundary-value problems associated with
capacitive networks and use these methods to solve some
classical problems in electrostatics involving conductors.


13.1. Weyl’s method of orthogonal projection 

We turn now to two other methods of dealing with the network problem, Weyl’s
method and Kirchhoff’s original method. Both of these methods are restricted to the
purely resistive case; i.e., all the $z_\alpha$ are positive real numbers, $z_\alpha = r_\alpha > 0$. However,
they are both interesting and worth studying. We begin with Weyl’s method.

The equation relating branch voltages $V$ and branch currents $I$ is

\[ V - W = Z(I - K) \]

which we can rewrite as

\[ V = W + Z(I - K) \]

or as

\[ V = Z(Z^{-1}W - K + I). \]

We now use the matrix $Z$ to define a positive-definite scalar product on the space
$C_1$ of branch currents by

\[ (I, I')_Z = \int_{I'} ZI = r_\alpha I_\alpha I'_\alpha + r_\beta I_\beta I'_\beta + \cdots. \]

Suppose now that we have any current distribution $I'$ satisfying Kirchhoff’s
current law $\partial I' = 0$ and any voltage distribution $V$ satisfying Kirchhoff’s voltage law,
so that $V = -d\Phi$. We know then that $\int_{I'} V = 0$. Since $V = Z(Z^{-1}W + I - K)$, we may
write this as

\[ \int_{I'} Z(Z^{-1}W + I - K) = 0. \]

Expressed in terms of the scalar product defined above, this equation is

\[ (I', Z^{-1}W + I - K)_Z = 0. \]

Thus we wish that the vector $(K - Z^{-1}W) - I$ be orthogonal to all cycles.

We can thus reformulate the resistive network problem as follows. We are given
the space $C_1$ of branch currents, in which lies the subspace $Z_1$ of cycles. For a
network with specified sources $W$ and $K$, we can form the vector $K - Z^{-1}W$, which




to zero. In general this vector is not an element of the subspace $Z_1$. We wish to find a
13.1: given the vector $K - Z^{-1}W$ in the space $C_1$, we must compute its orthogonal




relative to the scalar product defined by $Z$. Then



let $\pi$ denote the linear transformation which projects orthogonally onto $Z_1$, then
the solution to the network problem is

\[ I = \pi(K - Z^{-1}W). \]












ejection 


Then, if $u$ denotes an element of $C_1$, its projection onto $Z_1$ is given by


= 1 , U|C 1 -r u ; c 2 
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independent family of meshes, and then use the Gram-Schmidt procedure to 
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calculate its projection onto Z x by taking the sum of its projections along the 
orthonormal basis elements of Z 1 . Each such projection determines a column of the 


matrix n. 

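As a simple illustration of Weyl’s method, we consider the circuit shown in
figure 13.2. For the circuit, $C_1$ is two-dimensional. There is only one mesh, $\alpha + \beta$, so
$Z_1$ is the one-dimensional subspace spanned by the vector $\begin{pmatrix}1\\1\end{pmatrix}$. The matrix $Z$ is

$$Z = \begin{pmatrix}1 & 0\\ 0 & 3\end{pmatrix}.$$

To compute the projection matrix $\pi$ we must project the two basis vectors
$\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$ in turn onto the subspace $Z_1$. We first normalize the basis
element of $Z_1$: the squared length of $\begin{pmatrix}1\\1\end{pmatrix}$ is

$$\begin{pmatrix}1 & 1\end{pmatrix}\begin{pmatrix}1 & 0\\ 0 & 3\end{pmatrix}\begin{pmatrix}1\\1\end{pmatrix} = 4,$$

so we divide $\begin{pmatrix}1\\1\end{pmatrix}$ by $\sqrt{4} = 2$, obtaining $\begin{pmatrix}1/2\\1/2\end{pmatrix}$, which has unit length. Then the
projection of $\begin{pmatrix}1\\0\end{pmatrix}$ is

$$\left(\begin{pmatrix}1/2\\1/2\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}\right)_Z \begin{pmatrix}1/2\\1/2\end{pmatrix} = \frac12\begin{pmatrix}1/2\\1/2\end{pmatrix} = \begin{pmatrix}1/4\\1/4\end{pmatrix},$$

while the projection of $\begin{pmatrix}0\\1\end{pmatrix}$ is

$$\left(\begin{pmatrix}1/2\\1/2\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}\right)_Z \begin{pmatrix}1/2\\1/2\end{pmatrix} = \frac32\begin{pmatrix}1/2\\1/2\end{pmatrix} = \begin{pmatrix}3/4\\3/4\end{pmatrix}.$$

Therefore

$$\pi = \begin{pmatrix}1/4 & 3/4\\ 1/4 & 3/4\end{pmatrix}.$$

For the network of figure 13.2 we form the vector $K - Z^{-1}W$ from the sources $K$ and $W$
shown in the figure. Applying $\pi$ to this vector, we find immediately

$$I = \pi(K - Z^{-1}W) = \begin{pmatrix}1/4 & 3/4\\ 1/4 & 3/4\end{pmatrix}(K - Z^{-1}W) = \begin{pmatrix}2\\2\end{pmatrix};$$

i.e., the solution is $I_\alpha = I_\beta = 2$. Notice that $K - Z^{-1}W - I$ is orthogonal to the mesh
$\begin{pmatrix}1\\1\end{pmatrix}$ with respect to the scalar product defined by $Z$, as required.

It is easy to check that $\pi^2 = \pi$, as must be true for any projection operator.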
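The construction just described can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original text: the function names `z_dot`, `weyl_projection` and `matmul` are hypothetical, the matrix $Z$ is assumed to be diagonal (the purely resistive case), and the meshes passed in are assumed to be linearly independent. Exact rational arithmetic is used, so the Gram-Schmidt step divides by squared lengths instead of taking square roots.

```python
from fractions import Fraction as F

def z_dot(u, v, r):
    """Scalar product (u, v)_Z = sum of r_a * u_a * v_a for a diagonal Z."""
    return sum(ra * ua * va for ra, ua, va in zip(r, u, v))

def weyl_projection(meshes, r):
    """Matrix (list of rows) projecting Z-orthogonally onto the span of the meshes."""
    # Gram-Schmidt without square roots: keep an orthogonal basis and divide by
    # squared lengths when projecting, which is equivalent to normalizing first.
    basis = []
    for mesh in meshes:
        w = [F(x) for x in mesh]
        for b in basis:
            coeff = z_dot(w, b, r) / z_dot(b, b, r)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        basis.append(w)
    n = len(r)
    columns = []
    for j in range(n):
        e = [F(int(i == j)) for i in range(n)]       # j-th basis vector of C_1
        image = [F(0)] * n
        for b in basis:
            coeff = z_dot(e, b, r) / z_dot(b, b, r)
            image = [x + coeff * bi for x, bi in zip(image, b)]
        columns.append(image)                          # j-th column of pi
    return [[columns[j][i] for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# The two-branch circuit of the text: resistances 1 and 3, single mesh alpha + beta.
r = [F(1), F(3)]
pi = weyl_projection([[1, 1]], r)
print(pi)
```

For the two-branch circuit this reproduces the columns $(1/4, 1/4)$ and $(3/4, 3/4)$ found above, and `matmul(pi, pi)` returns `pi` again.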


13.2. Kirchhoff’s method

We turn finally to Kirchhoff’s method. Although this method was invented many 
decades before Weyl’s, we may regard it from our present point of view as providing 
an explicit formula for the projection operator $\pi$ in terms of the trees of the graph.

Suppose that we have a connected complex, and let $T$ denote a maximal tree.
Recall that for each such tree we have defined a projection operator $p_T$ by

$$p_T(\alpha) = \begin{cases} 0 & \text{if } \alpha \in T, \\ M_\alpha & \text{if } \alpha \notin T, \end{cases}$$

where $M_\alpha$ is the cycle corresponding to the unique mesh containing branch $\alpha$ whose
other branches all lie in $T$. (The cycle is to be chosen so that it contains $+\alpha$, not $-\alpha$.)
For example, in figure 13.4, the branch $\beta$ alone forms a maximal tree. Then

$$p_T(\alpha) = \alpha - \beta, \quad p_T(\beta) = 0, \quad p_T(\gamma) = \gamma + \beta,$$

Figure 13.4

so that the matrix representing this projection $p_T$ is

$$\begin{pmatrix} 1 & 0 & 0 \\ -1 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}.$$

Notice that the diagonal entries of $p_T$ are all $+1$ or zero.

Any such $p_T$ is a projection operator with range $Z_1$: its image is $Z_1$, and $p_T(M) = M$
for $M \in Z_1$. It is not, however, an orthogonal projection operator, because its
kernel is not orthogonal (with respect to the scalar product defined by $Z$) to the
subspace $Z_1$. To put it differently, $p_T$ is not self-adjoint: for a pair of branches $\alpha$ and $\beta$

$$(p_T\alpha, \beta)_Z \neq (\alpha, p_T\beta)_Z$$

in general.

While $p_T$ for a single maximal tree is not self-adjoint, there are many different
maximal trees in a complex. What Kirchhoff discovered was a scheme for taking
a weighted average of the various projection operators which is self-adjoint.
Suppose that for each maximal tree $T$ we have a real number $\lambda_T$, with $0 \le \lambda_T \le 1$
and $\sum_T \lambda_T = 1$, where the sum is over all trees. Then

$$p_\lambda = \sum_T \lambda_T p_T$$

is again a projection onto $Z_1$: its image is $Z_1$, and it maps any vector in $Z_1$
into itself. In general, $p_\lambda$ is not orthogonal, but there is one choice of $\lambda_T$ for which
$p_\lambda$ is orthogonal. Kirchhoff’s prescription for this choice is to define, for each
maximal tree $T$,

$$Q_T = \prod_{\beta \notin T} r_\beta,$$

the product of the resistances of all the branches not in $T$. The weighting factors $\lambda_T$
are then

$$\lambda_T = Q_T/R \quad\text{where}\quad R = \sum_T Q_T.$$

Thus we have an explicit formula for the Weyl projection operator $\pi$, namely

$$\pi = R^{-1}\sum_T Q_T p_T.$$

We must show that this operator is self-adjoint, or, what amounts to the same
thing, that $R\pi = \sum_T Q_T p_T$ is self-adjoint. Since the branches form a basis for $C_1$, it
suffices to show that for any pair of branches, $\alpha$ and $\beta$, we have

$$\sum_T Q_T (p_T\alpha, \beta)_Z = \sum_T Q_T (\alpha, p_T\beta)_Z.$$

For fixed $\alpha$ and $\beta$, as we sum over all maximal trees, three cases can arise, as shown
in figure 13.5.

Figure 13.5


Case 1. Both $\alpha$ and $\beta$ are in the tree. In this case, $p_T(\alpha) = 0$ and $p_T(\beta) = 0$, so the tree
contributes nothing to either side of the equation.

Case 2. Neither $\alpha$ nor $\beta$ is in the tree. In this case $p_T(\alpha)$ is a mesh which does not
contain $\beta$, so $(p_T(\alpha), \beta)_Z = 0$. Similarly $(\alpha, p_T(\beta))_Z = 0$, and so the tree again
contributes nothing.

Case 3. Branch $\alpha$ is not in the tree, but $\beta$ is in the tree. In this case, if $\beta$ is in the mesh
$M_\alpha$ formed by $\alpha$ and branches of the tree, we have

$$(p_T\alpha, \beta)_Z = \pm r_\beta$$

where the $+$ sign holds if $\alpha$ and $\beta$ occur with the same sign in $M_\alpha$, the $-$ sign if they
occur with opposite sign. For example, in figure 13.6(a), $p_T(\alpha) = \alpha - \gamma - \beta + \delta$ and
$(p_T(\alpha), \beta)_Z = -r_\beta$. There is now always a unique maximal tree, $T'$, formed by
replacing $\beta$ by $\alpha$ in the tree $T$, for which $p_{T'}(\beta) = \pm p_T(\alpha)$, so that $(\alpha, p_{T'}(\beta))_Z = \pm r_\alpha$.
Such a tree $T'$ is illustrated in figure 13.6(b).

Figure 13.6

For the pair of trees $T$ and $T'$, $Q_T r_\beta = Q_{T'} r_\alpha$, since deleting branch $\beta$ from $T$ leaves
the same branches as does deleting branch $\alpha$ from $T'$, and both $Q_T r_\beta$ and $Q_{T'} r_\alpha$ are
the product of the resistances of all the branches other than these remaining ones.
The relationship between $T$ and $T'$ is symmetrical if we interchange the roles of $\alpha$
and $\beta$. Thus

$$Q_T (p_T\alpha, \beta)_Z = Q_{T'} (\alpha, p_{T'}\beta)_Z,$$

and to each $T$ for which the left-hand side is non-zero there corresponds a unique $T'$
for which the equality holds, and vice versa. Thus, if we sum over all $T$ we get the
desired result

$$\sum_T Q_T (p_T\alpha, \beta)_Z = \sum_T Q_T (\alpha, p_T\beta)_Z$$

since we can ignore zero summands. We have thus proved Kirchhoff’s formula

$$\pi = R^{-1}\sum_T Q_T p_T$$

where

$$Q_T = \prod_{\beta \notin T} r_\beta, \qquad R = \sum_T Q_T$$

and the sums extend over all maximal trees.


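As an illustration of Kirchhoff’s method, consider the network of figure 13.7,
which has the same topology as figure 13.4. The resistances are $r_\alpha = 1$, $r_\beta = 2$,
$r_\gamma = 3$. There are three maximal trees, the branch $\alpha$ (tree $T_1$), the branch $\beta$ ($T_2$),
and the branch $\gamma$ ($T_3$).

Figure 13.7

For $T_1$,

$$p_{T_1} = \begin{pmatrix} 0 & -1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad Q_{T_1} = 2\cdot 3 = 6.$$

For $T_2$,

$$p_{T_2} = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad Q_{T_2} = 1\cdot 3 = 3.$$

For $T_3$,

$$p_{T_3} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{pmatrix} \quad\text{and}\quad Q_{T_3} = 1\cdot 2 = 2.$$

Then

$$R = Q_{T_1} + Q_{T_2} + Q_{T_3} = 6 + 3 + 2 = 11.$$

Finally,

$$\pi = \tfrac{1}{11}\left[6p_{T_1} + 3p_{T_2} + 2p_{T_3}\right],$$

or

$$\pi = \frac{1}{11}\begin{pmatrix} 5 & -6 & 6 \\ -3 & 8 & 3 \\ 2 & 2 & 9 \end{pmatrix}.$$

This matrix projects onto the subspace $Z_1$, spanned by the meshes $\begin{pmatrix}1\\-1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\\1\end{pmatrix}$;
you can check that $\pi\begin{pmatrix}1\\-1\\0\end{pmatrix} = \begin{pmatrix}1\\-1\\0\end{pmatrix}$ and $\pi\begin{pmatrix}0\\1\\1\end{pmatrix} = \begin{pmatrix}0\\1\\1\end{pmatrix}$. The matrix

$$1 - \pi = \frac{1}{11}\begin{pmatrix} 6 & 6 & -6 \\ 3 & 3 & -3 \\ -2 & -2 & 2 \end{pmatrix}$$

projects onto the orthogonal complement of $Z_1$. Its image, which is also the kernel
of $\pi$, is the one-dimensional subspace spanned by the vector $\begin{pmatrix}6\\3\\-2\end{pmatrix}$. This is indeed
orthogonal to both the meshes $\begin{pmatrix}1\\-1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\\1\end{pmatrix}$, since

$$Z\begin{pmatrix}6\\3\\-2\end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix}6\\3\\-2\end{pmatrix} = \begin{pmatrix}6\\6\\-6\end{pmatrix}$$

is a voltage distribution which gives zero on both meshes, in accordance
with Kirchhoff’s voltage law.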
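Kirchhoff’s formula for this three-branch network can be checked numerically. The following is a minimal sketch rather than a general implementation: the three matrices $p_T$ are entered by hand from the mesh conventions of figure 13.4, the resistances $1, 2, 3$ are those implied by the values of $Q_T$, and the names `tree_weight`, `matmul` and `z_pi` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction as F

branches = ["alpha", "beta", "gamma"]
r = {"alpha": F(1), "beta": F(2), "gamma": F(3)}   # assumed resistances

# p_T for each maximal tree (each tree is a single branch here).
# Column j of each matrix is the image of the j-th branch.
p = {
    "alpha": [[0, -1, 1], [0, 1, 0], [0, 0, 1]],
    "beta":  [[1, 0, 0], [-1, 0, 1], [0, 0, 1]],
    "gamma": [[1, 0, 0], [0, 1, 0], [1, 1, 0]],
}

def tree_weight(tree_branch):
    """Q_T: product of the resistances of the branches not in the tree."""
    q = F(1)
    for b in branches:
        if b != tree_branch:
            q *= r[b]
    return q

Q = {t: tree_weight(t) for t in branches}
R = sum(Q.values())

# Kirchhoff's weighted average pi = (1/R) * sum_T Q_T p_T.
pi = [[sum(Q[t] * p[t][i][j] for t in branches) / R for j in range(3)]
      for i in range(3)]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# pi is self-adjoint for ( , )_Z exactly when the matrix Z*pi is symmetric.
z_pi = [[r[branches[i]] * pi[i][j] for j in range(3)] for i in range(3)]

print(Q, R)
print(pi)
```

The symmetry of `z_pi` is the matrix form of the self-adjointness proved above, and `matmul(pi, pi)` reproduces `pi`.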

13.3. Green’s reciprocity theorem 

We conclude our study of resistive networks with a powerful result called Green’s 
reciprocity theorem, which will appear in a similar form when we study capacitive 
networks and electrostatics. To prove this theorem we imagine a network in which 
all but two of the branches contain only resistors, but no voltage or current sources. 
The remaining two branches, $\alpha$ and $\beta$, may contain voltage or current sources in
addition to resistors. With a specific choice of sources in branches $\alpha$ and $\beta$ we will
obtain a solution with currents $I$ satisfying Kirchhoff’s current law and voltages $V$
satisfying Kirchhoff’s voltage law. With a different choice of sources we obtain a
different solution, again with currents $\hat I$ and voltages $\hat V$ which satisfy both of
Kirchhoff’s laws. (A typical situation to which the theorem would apply is shown in
the accompanying figure.)

Both $V$ and $\hat V$, since they obey Kirchhoff’s voltage law, lie in the space $B^1$. We know
that any element of $B^1$, acting on any element of $Z_1$, gives zero, so we may conclude
that

$$\sum_{\text{branches}} V_\gamma \hat I_\gamma = 0 = \sum_{\text{branches}} \hat V_\gamma I_\gamma.$$

Now we single out branches $\alpha$ and $\beta$:

$$V_\alpha \hat I_\alpha + V_\beta \hat I_\beta + \sum_{\text{other branches}} V_\gamma \hat I_\gamma = \hat V_\alpha I_\alpha + \hat V_\beta I_\beta + \sum_{\text{other branches}} \hat V_\gamma I_\gamma.$$

We assumed that all branches except $\alpha$ and $\beta$ were simply resistors, so that $V_\gamma = r_\gamma I_\gamma$
and $\hat V_\gamma = r_\gamma \hat I_\gamma$. The two sums over the other branches are therefore equal. Thus

$$V_\alpha \hat I_\alpha + V_\beta \hat I_\beta = \hat V_\alpha I_\alpha + \hat V_\beta I_\beta$$

which is Green’s reciprocity theorem.

The theorem has interesting consequences for the
properties of passive resistive networks, ones which contain no sources, only
resistors. Given such a network, we can add a new branch in two different ways.
Starting from the resistive network of figure 13.9(a),
we could connect a new branch $\alpha$ between a pair of existing nodes, as in figure 13.9(b).
Such a ‘soldering entry’ creates no new nodes. The branch $\alpha$ could be a short circuit
as shown, or it could include a voltage source, current source, or a resistor.
Alternatively, we could make a ‘pliers entry’ by cutting an existing wire, creating a
new node as in figure 13.9(c).











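Case 1. We connect a battery of voltage $\mathcal{E}$ across $\alpha$ and measure the resulting current in
branch $\beta$. We then connect the same battery across $\beta$ instead, and measure the
current in branch $\alpha$. Since $V_\beta = 0$ and $\hat V_\alpha = 0$ (the branches without the battery
are short circuits), Green’s reciprocity theorem simplifies to

$$V_\alpha \hat I_\alpha = \hat V_\beta I_\beta.$$

But we used the same battery, so $\hat V_\beta = V_\alpha = \mathcal{E}$. Therefore $\mathcal{E} I_\beta = \mathcal{E}\hat I_\alpha$ and $I_\beta = \hat I_\alpha$.
The relationship between the applied voltage and the
resulting current is symmetrical.

Case 2. We insert a current source $j$ across $\alpha$ and measure the resulting voltage
across $\beta$, which is an open circuit. We then connect the same current source into
$\beta$ and measure the voltage across $\alpha$, which is now an open circuit. Since no current
flows through an open circuit, $I_\beta = 0$ and $\hat I_\alpha = 0$, so the theorem becomes

$$V_\beta \hat I_\beta = \hat V_\alpha I_\alpha.$$

The current source is the same, so $\hat I_\beta = I_\alpha$, and we conclude that

$$V_\beta = \hat V_\alpha.$$

More generally, we can have a resistive $n$-port from which $n$ pairs of wires
protrude. If $n$ current sources are connected to the various ports, the resulting
voltages $V_\alpha, V_\beta, \ldots$ will depend upon the currents according to some relationship

$$V = RI$$

where $R$ is an $n \times n$ matrix. The above argument shows that the matrix $R$ must be
symmetric, because, for any two ports $\lambda$ and $\mu$, the dependence of $V_\lambda$ on $I_\mu$ is the same
as the dependence of $V_\mu$ upon $I_\lambda$.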

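Green’s reciprocity theorem can also be derived as a consequence of the
mesh-current method. For a network containing only voltage sources, so that $K = 0$,
we have

$$J = -(sZ\sigma)^{-1}sW$$

so that

$$I = \sigma J = -\sigma(sZ\sigma)^{-1}sW.$$

Because $Z$ is symmetric and $s$ is the transpose of $\sigma$, the matrix

$$\sigma(sZ\sigma)^{-1}s$$

is symmetric, so that if we write

$$I = GW$$

the matrix $G$ which expresses the branch currents in terms of the imposed battery
voltages is symmetric. Similarly we can start from the node-potential solution

$$\Phi = ([\partial]Z^{-1}[d])^{-1}[\partial](K - Z^{-1}W).$$

For a network containing only current sources, so that $W = 0$, this gives

$$V = -[d]\Phi = -[d]([\partial]Z^{-1}[d])^{-1}[\partial]K.$$

Since $[d]$ is the transpose of $[\partial]$, the matrix

$$R = [d]([\partial]Z^{-1}[d])^{-1}[\partial],$$

which expresses the branch voltages in terms of the imposed currents through $V = -RK$,
is a symmetric matrix.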

Incidental! 
described by a symmetric matrix. 






13.4. Capacitive networks 

A minor modification of the network problem is to have a network of capacitors
instead of a network of resistors. Each branch is now allowed to have a battery in
series with a capacitor. In the
steady state no current will be flowing, since charge cannot cross between the plates
of the capacitors. We are interested in knowing the charge, $Q_\alpha$, on the capacitor in
branch $\alpha$ and the voltage $V_\alpha$ across the branch. A typical capacitive network, and the
sign conventions for a typical branch $\alpha$, are illustrated in figure 13.13. Notice that, as
before, positive $V_\alpha$ and $W_\alpha$ refer to voltage drops when the branch is traversed in the
sense defined by the arrow, and that positive $Q_\alpha$ implies that the positively charged
plate of the capacitor is encountered first as the branch is traversed. With these
conventions, if the capacitance in the branch is $C_\alpha$, we have the equation

$$V_\alpha - W_\alpha = \frac{Q_\alpha}{C_\alpha}.$$


We may regard the vector $Q = (Q_\alpha, Q_\beta, \ldots)$ as a one-chain, and the vector $V =
(V_\alpha, V_\beta, \ldots)^T$ as a one-cochain. However, the pairing between voltage and charge
now gives energy rather than power. In vector notation, we may write

$$V - W = C^{-1}Q$$

where $C$ is the diagonal matrix with entries $C_\alpha$. With the replacement of $Z$ by $C^{-1}$
and of power by energy the situation is formally identical to our discussion of
resistive circuits. The voltage $V$ still may be derived from a potential, so $V = -d\Phi$.
Since charge cannot flow across the capacitors, the total charge on the capacitors
connected to any node cannot change. If the capacitors were initially uncharged, this
total charge must be zero for each node, and $\partial Q = 0$, just as $\partial I = 0$ for resistive
networks.

Isolated networks of initially uncharged capacitors, for which $\partial Q = 0$, may be
treated exactly as we have treated networks of resistors. We need only replace $I$ by $Q$
and $Z$ by $C^{-1}$ in all our previous results. The batteries are still represented by $W$, and
there is nothing which plays the role of a current source. Then, by analogy with the
mesh-current method, we may introduce mesh charges, described by a vector $P \in Z_1$,
so that any charge distribution satisfying $\partial Q = 0$ may be written as $Q = \sigma P$. Then

$$P = (sC^{-1}\sigma)^{-1}(-sW).$$

Alternatively, we can replace $Z^{-1}$ by $C$ in the node-potential method to obtain

$$\Phi = -([\partial]C[d])^{-1}[\partial]CW.$$

Figure 13.14 (at node $A$: $\rho_A = -Q_\alpha - Q_\beta + Q_\gamma$)

More generally, if the capacitors were initially charged, there may be a fixed
total charge $\rho_A$ on the plates connected to each node. The vector $\rho = (\rho_A, \rho_B, \ldots)$
may be regarded as a zero-chain. Since, according to our sign conventions, the
negative plate of a capacitor is connected to the node at the end of the branch, we
write $\partial Q = -\rho$ in general. (Refer to figure 13.14 in order to convince yourself that
the minus sign is correct.) Since the sum of the charges on the two plates of each
capacitor is zero, we know that

$$\sum_{\text{all nodes}} \rho_A = 0$$

when the sum is taken over all nodes in a network.

The equation $\partial Q = -\rho$ is known as Gauss’ law. It expresses the node charges
in terms of the charges across the capacitors in the branches. Looking ahead to
our treatment of electrostatics in Chapter 16, we shall see that the analog of the
node charges will be the ‘charge density’ $\rho$, while the analog of $Q$ will be the
‘dielectric displacement’, $D$. We will explain in Chapter 16 that $D$ should be thought
of as a two-form on three-dimensional space and describe how one would perform
the physical experiments to measure it. It will then turn out that Gauss’ law is
nothing other than the three-dimensional version of Stokes’ theorem. In the ‘vector
calculus’ treatment of electrostatics as presented in most texts, the dielectric
displacement $D$ is thought of as a vector field. Then Gauss’ law becomes the
‘divergence theorem’. But in our present context of the theory of capacitive
networks, the equation $\partial Q = -\rho$ simply says that we assign to each node the sum of
the charges on the plates connected to it.




For the rest of this chapter we will study capacitive networks with initial charges
but without voltage sources. So we are assuming that $W = 0$, and our equations
become

$$Q = CV, \quad V = -d\Phi \quad\text{and}\quad \partial Q = -\rho.$$

We can combine these three equations to obtain

$$-\partial C d\Phi = -\rho.$$

This equation is known as Poisson’s equation. It gives the relation between the
potential function and the node charges. The operator $-\partial C d$ occurring on the
left is called the Laplacian and is denoted by $\Delta$, so that Poisson’s equation reads

$$\Delta\Phi = -\rho.$$

Now $d$ maps $C^0 \to C^1$, $C$ maps $C^1 \to C_1$ and $\partial$ maps $C_1 \to C_0$. Therefore the
operator $\Delta$ maps $C^0 \to C_0$, and each side of Poisson’s
equation belongs to $C_0$, as is to be expected.

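It is instructive to write out explicitly the form of the Laplacian. Let $u$ be an
element of $C^0$ and $A$ a node. If $\alpha$ is a branch with $A$ one of its boundaries, then
$\partial\alpha = \pm(B - A)$, where $B$ is the node at the other end of $\alpha$, so $C\,du$ has coefficient

$$\pm C_\alpha(u(B) - u(A)) \text{ at } \alpha.$$

Applying $-\partial$ and collecting the coefficient at the node $A$, we obtain

$$(\Delta u)(A) = \sum_{\alpha:\ \partial\alpha = \pm(B - A)} C_\alpha(u(B) - u(A)).$$

In particular, $u$ satisfies Laplace’s equation

$$\Delta u = 0$$

if and only if

$$u(A) = \frac{\sum_\alpha C_\alpha u(B)}{\sum_\alpha C_\alpha}$$

at each node $A$, where the sums extend over all branches $\alpha$
emanating from $A$, as illustrated in figure 13.15. In other words, Laplace’s equation
says that the value of $u$ at $A$ is the weighted average of its values at nearest neighbor
nodes, the weighting being given by the capacitances.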
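The explicit form of the Laplacian and the averaging property can be illustrated with a short Python sketch. The network used here is a hypothetical one chosen only for illustration (a node $D$ joined to $A$, $B$ and $C$ by capacitors of capacitance $2$, $1$ and $1$), and the function names `laplacian` and `neighbor_average` are not taken from the text.

```python
from fractions import Fraction as F

def laplacian(u, branches):
    """(Delta u)(A) = sum over branches at A of C_alpha * (u(B) - u(A))."""
    out = {node: F(0) for node in u}
    for tail, head, cap in branches:
        # Each branch contributes to both of its end points; the result does
        # not depend on the orientation chosen for the branch.
        out[tail] += cap * (u[head] - u[tail])
        out[head] += cap * (u[tail] - u[head])
    return out

def neighbor_average(u, branches, node):
    """Capacitance-weighted average of u over the nearest neighbors of node."""
    num = F(0)
    den = F(0)
    for tail, head, cap in branches:
        if tail == node:
            num += cap * u[head]
            den += cap
        elif head == node:
            num += cap * u[tail]
            den += cap
    return num / den

# Hypothetical network: D joined to A, B, C with capacitances 2, 1, 1.
branches = [("D", "A", F(2)), ("D", "B", F(1)), ("D", "C", F(1))]
u = {"A": F(4), "B": F(0), "C": F(8), "D": F(4)}

lap = laplacian(u, branches)
avg = neighbor_average(u, branches, "D")
print(lap["D"], avg)
```

With these values $u(D)$ equals the weighted average of its neighbors, so $(\Delta u)(D)$ vanishes; changing $u(D)$ to any other value makes it non-zero.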

This interpretation should be compared with the partial differential equation

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

in the plane, whose solutions have the property that the value of
$u$ at a point $P$ is the average of $u$ over a small circle centered at $P$. For the concept
of circle to be defined, we had to have the Euclidean geometry of the
plane. Similarly, in order to
define the nearest neighbor average, we need the matrix $C$ of capacitances. There
is an analogy between the choice of $C$ and the choice of Euclidean plane geometry.
We shall pursue this deep point in Chapter 16.

Let us now return to our study of Poisson’s equation. Notice that the $\Phi$ occurring
on the left of Poisson’s equation is only determined up to a function which is
constant on all connected components of the network. Indeed any such function
lies in the kernel of $d$, and hence surely lies in the kernel of $\Delta = -\partial C d$. Also the
$\rho$ occurring on the right must lie in the subspace
$B_0$ of $C_0$ since, by definition, the node charges are determined by applying $\partial$ to
$Q$. Recall that we introduced
the restricted coboundary operator $[d]: P^0 \to C^1$, and we also introduced the
restricted boundary operator $[\partial]: C_1 \to P_0$. We may therefore define the restricted
Laplacian $[\Delta] = -[\partial]C[d]$ and
consider the restricted Poisson equation

$$[\Delta]\Psi = -\rho$$

for $\Psi \in P^0$. (In other words, $\Psi$ is a ‘potential determined up to additive constant
on each connected component’.) Now since the operator $C$ is given by a positive
diagonal matrix in terms of the standard basis of $C^1$ and $C_1$, the operator
$[\partial]C[d]$ is invertible, and the
solution of the restricted Poisson equation is given by

$$\Psi = ([\partial]C[d])^{-1}\rho.$$

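Illustrative example

A simple Poisson equation problem is presented in figure 13.16. Four units of
charge are placed at node $C$, and we wish to find the potential at
all nodes. (If you want to specify units, use microfarads for capacitance,
microcoulombs for charge, and volts for potential.)

We choose $A$ as the ground node. The restricted boundary operator is then

$$[\partial] = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}$$

(remember, the row corresponding to the ground node is deleted), and the restricted
coboundary operator is its transpose,

$$[d] = \begin{pmatrix} 1 & 0 \\ -1 & 1 \\ 0 & -1 \end{pmatrix}.$$

The matrix $C$ is $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$, so

$$[\partial]C[d] = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ -1 & 1 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 3 & -2 \\ -2 & 5 \end{pmatrix}$$

and the solution to Poisson’s equation is

$$\Psi = ([\partial]C[d])^{-1}\rho = \frac{1}{11}\begin{pmatrix} 5 & 2 \\ 2 & 3 \end{pmatrix}\rho.$$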
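A restricted Poisson problem of this kind can be solved mechanically. The sketch below is written under explicit assumptions, since the figure itself is not reproduced here: the network is taken to be a triangle with branches $A \to B$, $B \to C$, $C \to A$ of capacitance $1$, $2$, $3$, node $A$ is ground, and the node charges are assumed to be $\rho_B = 0$, $\rho_C = 4$. The helper names `restricted_matrix` and `solve` are hypothetical.

```python
from fractions import Fraction as F

def restricted_matrix(branches, free_nodes):
    """Matrix of [boundary] C [d], with the rows and columns of ground removed."""
    idx = {node: i for i, node in enumerate(free_nodes)}
    m = [[F(0)] * len(free_nodes) for _ in free_nodes]
    for tail, head, cap in branches:
        # The boundary of a branch is head - tail; entries at ground are dropped.
        coeffs = {}
        if head in idx:
            coeffs[idx[head]] = 1
        if tail in idx:
            coeffs[idx[tail]] = -1
        for i, ci in coeffs.items():
            for j, cj in coeffs.items():
                m[i][j] += cap * ci * cj
    return m

def solve(m, b):
    """Solve m x = b by Gauss-Jordan elimination over the rationals."""
    n = len(b)
    a = [list(row) + [bi] for row, bi in zip(m, b)]
    for c in range(n):
        pivot_row = next(i for i in range(c, n) if a[i][c] != 0)
        a[c], a[pivot_row] = a[pivot_row], a[c]
        pivot = a[c][c]
        a[c] = [x / pivot for x in a[c]]
        for i in range(n):
            if i != c:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[c])]
    return [row[n] for row in a]

# Assumed data (see the lead-in): triangle network, A grounded.
branches = [("A", "B", F(1)), ("B", "C", F(2)), ("C", "A", F(3))]
free_nodes = ["B", "C"]
rho = [F(0), F(4)]          # assumed charges at B and C

m = restricted_matrix(branches, free_nodes)
psi = solve(m, rho)         # potentials at B and C relative to ground
print(m, psi)
```

Under these assumptions the potentials come out as $8/11$ at $B$ and $12/11$ at $C$; a different assignment of node charges only changes the vector `rho`.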


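The Poisson equation problem that we just solved is the analogue of the electrostatic
problem of finding the potential due to a specified charge
distribution $\rho(\mathbf{r})$ in vacuum, with no boundary conditions specified. More interesting
electrostatic problems are ones in which the boundary condition of zero
potential is imposed on some surface.

13.5. Boundary-value problems

In electrostatics one frequently has to find a potential whose values are specified on
the boundary of a region. We shall now consider the discrete analogues of such
problems.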

We imagine a connected network of capacitors, with no batteries along any of the 
branches, as shown in figure 13.17. We subdivide the nodes of the network into two 
classes; boundary nodes and interior nodes. Boundary nodes, such as A and B, are 
connected to external sources which maintain their potentials at specified 
values. Since charge can flow freely along the wires connecting these nodes to the
external sources, it will not be reasonable to specify the total charge at a boundary
node. Interior nodes, such as $C$ and $D$, are connected only to other nodes of the
network, not to any external sources. The total charge at one of these nodes is 
constant and may be specified; it is unreasonable, though, to specify the potential at 
interior nodes. 

In the general problem concerning such a network, we would specify the potential 
at each boundary node and the total charge at each interior node arbitrarily, then try 
to determine the potential at each interior node and the charge at each boundary 
node. In fact, such a general problem can be expressed as a superposition of two 
more restricted problems: 

(1) A Dirichlet problem, in which the total charge at each interior node is set 
equal to zero and the potential at each boundary node has a specified 
(generally non-zero) value. 

(2) A Poisson equation problem, in which the potential at each boundary node is 
set equal to zero and the total charge p at each interior node has a specified 
value. The problem posed in figure 13.16 was of this type, with node A
functioning as a boundary node. 

The solution to the general problem can always be expressed as the superposition 
of the solution to a Dirichlet problem with the appropriate boundary conditions and 
a Poisson problem with the appropriate charges at interior nodes.

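Corresponding to our decomposition of nodes into two classes, we get a
decomposition of the vector spaces $C_0$ (node charges) and $C^0$ (potentials). The space
$C_0$ is the direct sum

$$C_0 = C_0^{\mathrm{bound}} \oplus C_0^{\mathrm{int}},$$

where $C_0^{\mathrm{bound}}$ consists of zero-chains having non-zero coefficients only at the
boundary nodes and $C_0^{\mathrm{int}}$ consists of those zero-chains which have non-zero
coefficients only at the interior nodes. Similarly, we have

$$C^0 = C^0_{\mathrm{bound}} \oplus C^0_{\mathrm{int}},$$

where $C^0_{\mathrm{bound}}$ consists of those linear functions which vanish on $C_0^{\mathrm{int}}$, while $C^0_{\mathrm{int}}$
consists of those functions which vanish on $C_0^{\mathrm{bound}}$.

The potential $\Phi$ may be decomposed in a unique way as the sum of an element of
each of $C^0_{\mathrm{bound}}$ and $C^0_{\mathrm{int}}$. We write

$$\Phi = \Phi_{\mathrm{bound}} + \Phi_{\mathrm{int}}.$$

For the circuit of figure 13.17, with boundary nodes $A$ and $B$ and interior nodes $C$
and $D$, this decomposition reads simply

$$\begin{pmatrix}\Phi^A\\ \Phi^B\\ \Phi^C\\ \Phi^D\end{pmatrix} = \begin{pmatrix}\Phi^A\\ \Phi^B\\ 0\\ 0\end{pmatrix} + \begin{pmatrix}0\\ 0\\ \Phi^C\\ \Phi^D\end{pmatrix}.$$

Similarly, the vector representing node charges may be decomposed as

$$\rho = \rho^{\mathrm{bound}} + \rho^{\mathrm{int}}$$

so that, in the circuit of figure 13.17,

$$\begin{pmatrix}\rho_A\\ \rho_B\\ \rho_C\\ \rho_D\end{pmatrix} = \begin{pmatrix}\rho_A\\ \rho_B\\ 0\\ 0\end{pmatrix} + \begin{pmatrix}0\\ 0\\ \rho_C\\ \rho_D\end{pmatrix}.$$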

In terms of these vector space decompositions, we can formulate our problems as
follows:

(1) Dirichlet problem: Since the potential is specified at each boundary node,
$\Phi_{\mathrm{bound}}$ is a known vector. Furthermore, since there is no charge at any interior node,
the vector $\rho^{\mathrm{int}} = 0$. We must determine the vector $\Phi_{\mathrm{int}}$ so that

$$\Delta(\Phi_{\mathrm{int}} + \Phi_{\mathrm{bound}}) = 0$$

at all interior nodes. At the boundary nodes, we can calculate

$$\Delta(\Phi_{\mathrm{int}} + \Phi_{\mathrm{bound}}) = -\rho^{\mathrm{bound}}$$

and will then know the potential and charge at all nodes.

(2) Poisson equation problem: Since the charge is specified at each interior node,
$\rho^{\mathrm{int}}$ is a known vector. Furthermore, since the potential is specified as zero on each
boundary node, $\Phi_{\mathrm{bound}} = 0$. We must determine the vector $\Phi_{\mathrm{int}}$ so that

$$\Delta(\Phi_{\mathrm{bound}} + \Phi_{\mathrm{int}}) = -\rho^{\mathrm{int}}$$

at all interior nodes.





At any interior node, we can find a $\Phi_{\mathrm{int}}$ which does not vanish at that node, and



follows from (13.1) that 



As there are $n_b - 1$ boundary nodes left over if we exclude ground, we see that we can
arbitrarily assign the value of $\Phi$ at the remaining boundary nodes. Thus we can
always solve the Dirichlet problem.

As an explicit example of the decomposition (13.1), consider the network of figure
13.18. Here $A$, $B$, and $C$ are boundary nodes; $D$ is an interior node. Since there are
four branches, $\dim C^1 = 4$. The orthogonal subspaces are as follows.


(1) C 1 Z 1 is a one-dimensional subspace with basis 


1 

2 
1 


V 1 / 


. This basis element 



nq 


f A 


corresponds to a charge distribution, C 

1 

2 

_1_ 

— 

i 

, for which p- 0 at all 









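unique decomposition

$$\hat V = V + W$$

where $V$ is an element of $dC^0_{\mathrm{int}}$ and $W$ is orthogonal to $dC^0_{\mathrm{int}}$. Being an element of
$dC^0_{\mathrm{int}}$, $V$ is derived from a potential which vanishes at all boundary nodes. Being
orthogonal to $dC^0_{\mathrm{int}}$, $W$ satisfies $\partial CW = 0$ at all
interior nodes. Thus $-\partial CV = -\partial C\hat V = \rho$ at interior nodes, and $V$ is the solution to
the Poisson equation problem. If $\pi$ denotes the orthogonal projection onto $dC^0_{\mathrm{int}}$,
we may express the solution as

$$V = \pi\hat V.$$

Similarly, we may solve Dirichlet’s problem by orthogonal projection. Let $\tilde\Phi$
denote a potential which has the specified values at boundary nodes and which is
arbitrary at interior nodes. Let $\tilde V = -d\tilde\Phi$ and write

$$\tilde V = \pi\tilde V + (1 - \pi)\tilde V.$$

We may write $\pi\tilde V = -d\psi$, where $\psi = 0$ at all boundary nodes. If we
write

$$(1 - \pi)\tilde V = -d\Phi,$$

we know, since $(1 - \pi)\tilde V$ is orthogonal to $dC^0_{\mathrm{int}}$, that $\Delta\Phi = 0$ at all interior nodes.
Furthermore, since

$$-d\tilde\Phi = -d\psi - d\Phi$$

and $\psi = 0$ at all boundary nodes, $\Phi = \tilde\Phi$ on the boundary. Thus $\Phi$ is the desired
solution to Dirichlet’s problem.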


Examples 

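Both of these methods may be illustrated for the network of figure 13.19. For that
network, a basis for $dC^0_{\mathrm{int}}$ was

$$u = \begin{pmatrix}1\\-1\\0\\-1\end{pmatrix},$$

a vector whose (length)$^2$ (relative to the scalar product defined by $C$) is $1 + 2 + 1 =
4$. Then, for any $V \in C^1$,

$$\pi V = \tfrac14 (u, V)_C\, u.$$

Applying $\pi$ to the basis vectors

$$\begin{pmatrix}1\\0\\0\\0\end{pmatrix}, \quad \begin{pmatrix}0\\1\\0\\0\end{pmatrix}, \quad \begin{pmatrix}0\\0\\1\\0\end{pmatrix}, \quad \begin{pmatrix}0\\0\\0\\1\end{pmatrix}$$

we find the matrices

$$\pi = \frac14\begin{pmatrix} 1 & -2 & 0 & -1 \\ -1 & 2 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ -1 & 2 & 0 & 1 \end{pmatrix}
\quad\text{and}\quad
1 - \pi = \frac14\begin{pmatrix} 3 & 2 & 0 & 1 \\ 1 & 2 & 0 & -1 \\ 0 & 0 & 4 & 0 \\ 1 & -2 & 0 & 3 \end{pmatrix}.$$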


and we again obtain the correct solution. _ 

In constructing $\hat V$, it is simplest, whenever possible, to assign all the charge on each
interior node to a single branch which connects that node to a boundary node. In
complicated networks where there are interior nodes connected only to other
interior nodes, it is advisable to begin with such nodes and work from the inside out.
For example, in figure 13.20, where $A$ and $E$ are boundary nodes, it is best to begin














would have been obtained with any other arbitrarily chosen value for 6 D . 

13.7. Green’s functions 

The map which assigns to the charge distribution $\rho$ the corresponding potential $u$
which solves the Poisson equation is called the Green’s operator. The matrix entries
of this operator are denoted
by $G(A, B)$, where $A$ and $B$ are nodes. We use the notation $G(A, B)$ instead of the
subscript notation for matrix entries. Thus

$G(A, B)$ = potential at $B$ due to solving the Poisson equation with unit charge at $A$.

The function $G$ that assigns to the nodes $A$ and $B$ the matrix element $G(A, B)$ is
called the Green’s function. So $G$ is a function of two variables. When considered as a
function of the second variable, with the first variable fixed, it is an element of $C^0_{\mathrm{int}}$.
We let $\Delta_2$ denote the operator $\Delta$ applied to the element $G(A, \cdot)$ of $C^0_{\mathrm{int}}$. Thus

$$-\Delta_2 G(A, \cdot) = \delta_A \text{ at interior nodes},$$

$$G(A, B) = 0 \text{ for } B \text{ on the boundary}.$$

To repeat,

$$\Delta: C^0_{\mathrm{int}} \to C_0^{\mathrm{int}}.$$

The inverse of $-\Delta$ is $G$, so the solution of the Poisson equation with charge
distribution $\rho$ is

$$u(B) = \sum_A \rho(A) G(A, B) \quad\text{for all } B$$

where the sum extends over all interior $A$.

We can also use the Green’s function to solve the Dirichlet problem. For this
we need Green’s formula. For any $u$ and $v$ in $C^0$ we have

$$\sum_{\text{all } A} u(A)\,\Delta v(A) = -(du, dv)_C$$

where $(\ ,\ )_C$ denotes the symmetric scalar product on $C^1$ defined by $C$. Thus

$$\sum_{\text{all } A} u(A)\Delta v(A) = \sum_{\text{all } A} \Delta u(A) v(A).$$

If we split the sum up into two parts, one over the boundary and the other over
the interior, we get Green’s formula:

$$\sum_{\text{interior } A} \big(u(A)\Delta v(A) - \Delta u(A) v(A)\big) = -\sum_{\text{boundary } B} \big(u(B)\Delta v(B) - \Delta u(B) v(B)\big). \qquad (13.2)$$

Let us take two interior points $A_1$ and $A_2$, and set $u = G(A_1, \cdot)$ and $v = G(A_2, \cdot)$ in
Green’s formula; the right-hand side in (13.2) vanishes since $G(A_1, \cdot)$ and
$G(A_2, \cdot)$ vanish on the boundary. At interior points $\Delta u(A) = 0$ if $A \neq A_1$ and
$-\Delta u(A_1) = 1$. Similarly for $v$ and $A_2$. Thus the vanishing of the left-hand side of (13.2)
says that

$$G(A_1, A_2) = G(A_2, A_1).$$

If we write $u = \Phi$, $\Delta u = -\rho$ and $v = \hat\Phi$, $\Delta v = -\hat\rho$, (13.2) becomes

$$\sum_{\text{interior nodes } C} \big(\rho_C \hat\Phi^C - \hat\rho_C \Phi^C\big) = \sum_{\text{boundary nodes } B} \big(\hat\rho_B \Phi^B - \rho_B \hat\Phi^B\big). \qquad (13.3)$$

We see that Green’s formula is just Green’s reciprocity theorem for capacitive
networks.

We now suppose that $\Phi$ is a solution to the Dirichlet problem with specified
values on the boundary, so $\rho = 0$ in the interior. Take $\hat\Phi = G(A, \cdot)$ in (13.3). Then we
have

$$\sum_{\text{interior nodes } C} \big({-\Delta G(A, C)}\,\Phi^C - \rho_C\, G(A, C)\big) = \sum_{\text{boundary nodes } B} \big(\rho_B\, G(A, B) + \Delta G(A, B)\,\Phi^B\big).$$

Recall now that $-\Delta G(A, C) = 1$ if $C = A$ and $0$ otherwise, because $G(A, C)$ is the
solution to Poisson’s equation with one unit of charge at $A$, zero at other interior
nodes. Furthermore, we have assumed that $\rho_C = 0$ at interior nodes. So only one
term, $\Phi^A$, survives on the left-hand side of the equation. On the right, $G(A, B) = 0$,
so we have finally

$$\Phi^A = \sum_{\text{boundary nodes } B} \Delta G(A, B)\,\Phi^B.$$

That is, $\Phi^A$, at an interior node $A$, is a weighted average of the values of $\Phi$
at boundary nodes, with the weighting factor $\Delta G(A, B)$ being the quantity of negative
charge which is induced at boundary node $B$ when unit positive charge is placed
at $A$ with the entire boundary held at zero potential. The matrix $(\Delta G(A, B))$, which
represents a linear transformation from $C^0_{\mathrm{bound}}$ to $C^0_{\mathrm{int}}$, is known as the Poisson
kernel.

It is a simple matter to apply the Green’s function approach to the network of
figure 13.18. Since there is only one interior node, $D$, we need only to solve Poisson’s
equation for the case $\rho_D = 1$. We have already done this, obtaining voltages

$$V = \frac14\begin{pmatrix}1\\-1\\0\\-1\end{pmatrix}.$$

Therefore, with $A$, $B$, and $C$ grounded, the potential at $D$ is $\tfrac14$, and $G(D, D) = \tfrac14$.
The charges at the boundary nodes under the circumstances are $\rho_A = -\tfrac12$,
$\rho_B = \rho_C = -\tfrac14$, so we have $\Delta G(D, A) = \tfrac12$, $\Delta G(D, B) = \Delta G(D, C) = \tfrac14$. Armed
with this Green’s function, we can easily solve Dirichlet’s problem for the network. If
$\rho_D = 0$, then $\Phi^D$ is given by a weighted average of the potentials on the boundary,
with $\Delta G$ supplying the weights:

$$\Phi^D = \Delta G(D, A)\Phi^A + \Delta G(D, B)\Phi^B + \Delta G(D, C)\Phi^C$$

or

$$\Phi^D = \tfrac12\Phi^A + \tfrac14\Phi^B + \tfrac14\Phi^C.$$
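The Poisson kernel of a small network can be computed directly from the capacitances. The sketch below is an illustration under stated assumptions, not the procedure of the text: it solves Laplace’s equation at the interior nodes once for each boundary node, the network is a stand-in for figure 13.18 with assumed capacitances $2$, $1$, $1$ on the branches joining $D$ to $A$, $B$, $C$, the boundary-to-boundary branch `("B", "C", 1)` is a hypothetical placement of the fourth branch, and the names `solve` and `poisson_kernel` are introduced only here.

```python
from fractions import Fraction as F

def solve(m, b):
    """Solve m x = b by Gauss-Jordan elimination over the rationals."""
    n = len(b)
    a = [list(row) + [bi] for row, bi in zip(m, b)]
    for c in range(n):
        pivot_row = next(i for i in range(c, n) if a[i][c] != 0)
        a[c], a[pivot_row] = a[pivot_row], a[c]
        pivot = a[c][c]
        a[c] = [x / pivot for x in a[c]]
        for i in range(n):
            if i != c:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[c])]
    return [row[n] for row in a]

def poisson_kernel(branches, interior, boundary):
    """Weights w[A][B] such that u(A) = sum over B of w[A][B] * phi(B)."""
    idx = {node: i for i, node in enumerate(interior)}
    k = len(interior)
    m = [[F(0)] * k for _ in range(k)]            # interior part of the Laplacian
    coupling = {b: [F(0)] * k for b in boundary}  # capacitance to each boundary node
    for x, y, cap in branches:
        for p, q in ((x, y), (y, x)):
            if p in idx:
                m[idx[p]][idx[p]] += cap
                if q in idx:
                    m[idx[p]][idx[q]] -= cap
                else:
                    coupling[q][idx[p]] += cap
    kernel = {a: {} for a in interior}
    for b in boundary:
        weights = solve(m, coupling[b])           # unit potential at b, zero elsewhere
        for a in interior:
            kernel[a][b] = weights[idx[a]]
    return kernel

# Assumed stand-in for figure 13.18; the B-C branch joins two boundary nodes
# and therefore has no influence on the kernel.
branches = [("D", "A", F(2)), ("D", "B", F(1)), ("D", "C", F(1)), ("B", "C", F(1))]
kernel = poisson_kernel(branches, ["D"], ["A", "B", "C"])
print(kernel["D"])
```

The weights at each interior node add up to one, which is the statement that the interior potential is a weighted average of the boundary potentials.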

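There is another version of Green’s formula (called Green’s second formula)
which is sometimes useful. Let us now look at a boundary point, $A$. In the formula

$$\Delta u(A) = -\sum_{\alpha \text{ with } \pm\partial\alpha = (B - A)} C_\alpha\big(u(A) - u(B)\big)$$

let us divide the sum on the right into two parts according to whether the branch
$\alpha$ joins two points on the boundary or joins $A$ to an interior
point. Thus

$$\Delta u(A) = -\sum_{\substack{\alpha \text{ with } \pm\partial\alpha = B - A \\ B \text{ an interior point}}} C_\alpha\big(u(A) - u(B)\big) \;-\; \sum_{\substack{\alpha \text{ with } \pm\partial\alpha = B - A \\ B \text{ a boundary point}}} C_\alpha\big(u(A) - u(B)\big).$$

Let us denote the second sum, together with its sign, by $\Delta^{\mathrm{bound}}u(A)$. We can think of the boundary,
together with the branches of the network which join two boundary points, as
forming a network in its own right. Then $\Delta^{\mathrm{bound}}$ would be the Laplace operator
on this subnetwork, determined by the capacitances of these boundary-to-boundary
branches. Therefore,

$$\sum_{\text{boundary } A} u(A)\Delta^{\mathrm{bound}}v(A) = \sum_{\text{boundary } A} \Delta^{\mathrm{bound}}u(A)\, v(A)$$

and so

$$\sum_{\text{boundary } A} \big(u(A)\Delta^{\mathrm{bound}}v(A) - \Delta^{\mathrm{bound}}u(A)\, v(A)\big) = 0.$$

Hence

$$\sum_{A \text{ in boundary}} \big(u(A)\Delta v(A) - \Delta u(A) v(A)\big)
= -\sum_{\substack{A \text{ in boundary} \\ \partial\alpha = \pm(B - A) \\ B \text{ in interior}}} \big(u(A)\, C_\alpha(v(A) - v(B)) - C_\alpha(u(A) - u(B))\, v(A)\big).$$

The terms involving $u(A)v(A)$ on the right cancel and the right-hand side simplifies to

$$-\sum_{\substack{A \text{ in boundary} \\ \partial\alpha = \pm(B - A) \\ B \text{ in interior}}} C_\alpha\big(u(B)v(A) - u(A)v(B)\big).$$

Substituting into Green’s formula, we get Green’s second formula:

$$-\sum_{\text{interior } A} \big(u(A)\Delta v(A) - \Delta u(A) v(A)\big) = \sum_{\substack{A \text{ in boundary} \\ \partial\alpha = \pm(B - A) \\ B \text{ in interior}}} C_\alpha\big(u(A)v(B) - u(B)v(A)\big).$$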


13.8. The Poisson kernel and random walk 

We continue our study of the Dirichlet problem. Let $Q = \Delta G$ be the Poisson kernel.
Thus, if $\phi \in C^0_{\mathrm{bound}}$ gives the specified boundary potential, then $u$, given by

$$u(A) = \sum_{B \text{ on boundary}} Q(A, B)\phi(B),$$

gives the solution of the Dirichlet problem with boundary values $\phi$. In the preceding
section we saw how to find $G$, and hence $Q$, by the method of orthogonal projection.
In this section we shall give a probabilistic construction of $Q$. While this method
is not usually amenable to direct computation, it does give new insights into the
problem and also provides an iterative scheme for computing $Q$.

We first recall exactly what the Dirichlet problem says. Let $A$ be any interior
node. Let $B$ be a nearest neighbor node to $A$. That is, $B$ is a node at the other
end of some branch $\beta$ with end point $\pm A$. ($B$ can be interior or boundary.) Let

$$P_{B,A} = C_\beta\Big(\sum_{\partial\alpha = \pm(E - A)} C_\alpha\Big)^{-1}.$$

Then the condition

$$(\Delta u)(A) = 0$$

is the same as

$$u(A) = \sum_B u(B)P_{B,A}$$

summed over all nearest neighbors. Notice that by construction

$$P_{B,A} > 0, \qquad \sum_B P_{B,A} = 1.$$

Let us set

$$P_{A,A} = 1, \qquad P_{A',A} = 0, \quad A' \neq A$$

at all boundary nodes. Then let $P$ denote the $n \times n$ matrix whose entries are $P_{B,A}$.
To fix the notation, let us write the boundary nodes first, so that the upper left-hand
corner of $P$ will be a $k \times k$ identity matrix, where $k$ is the number of boundary
nodes. The column corresponding to an interior node $A$ will have $P_{B,A}$ at the $B$th
position, if $B$ is a nearest neighbor of $A$, and $0$ otherwise. The condition that
$\Delta u = 0$ at all interior nodes can then be written as

$$u = uP$$

if we think of $u = (u(B_1), u(B_2), \ldots)$ as a row vector. If the first $k$ components of $u$
are given by $\phi$, then so are the first $k$ components of $uP$, since the first $k$ columns
of $P$ are the first $k$ columns of the identity matrix. Thus the Dirichlet problem can
be formulated as

$uP = u$ and the first $k$ entries of $u$ are given by $\phi$.

On the other hand, the matrix $P$ has the structure of a matrix of transition probabilities for a random walk on the nodes of the network. A particle located at an interior node $A$ moves, in one step, to a nearest neighbor $B$ with probability $P_{B,A}$, while a particle located at a boundary node stays there. The entries of the power $P^N$ are then the probabilities of the various transitions in $N$ steps. As $N \to \infty$, the probability of remaining in the interior goes to zero. Thus, in the limit, the matrix $P^N$ tends to a matrix of the form

$$H = \begin{pmatrix} I & Q \\ 0 & 0 \end{pmatrix}$$

where $I$ is the $k \times k$ identity matrix and $Q$ is a matrix with $k$ rows and $n - k$ columns, where $n - k$ is the number of interior nodes. The entry $Q(A, B)$ represents the probability that a particle starting at the interior node $A$ ends up at the particular boundary point $B$. (It must eventually end at some boundary point.) Now $P^N \to H$ and hence $P^{N+1} = P^N P \to H$ and thus $HP = H$. Therefore, if $\mathbf{v}$ is any potential, $\mathbf{v}H$ satisfies

$$(\mathbf{v}H)P = \mathbf{v}H.$$

Hence

$$u = \mathbf{v}H$$

is a solution of the Dirichlet problem, whose boundary values are given by the first $k$ entries in $\mathbf{v}$. If we choose $\mathbf{v}$ to be a vector whose first $k$ components are given by $\phi$, i.e., if we choose $\mathbf{v}$ of the form

$$\mathbf{v} = (\phi, \mathbf{w}), \qquad \mathbf{w} \text{ any } (n - k)\text{-component row vector},$$

we see that, at any interior point $A$, we have

$$u(A) = (\mathbf{v}H)(A) = \sum_{B \text{ in boundary}} \phi(B)\,Q(A, B)$$

so that $Q$ is indeed the Poisson kernel.

The form of $H$ as the limit of $P^N$ can be given an interesting geometrical interpretation. The entry $P^N(A_2, A_1)$ is the probability of getting from $A_1$ to $A_2$ in $N$ steps. This probability is the sum of the probabilities of all $N$-step paths leading from $A_1$ to $A_2$. (If $A_2$ is a boundary node, then some of these paths may really be of smaller length - that is, the last $r$ ‘steps’ in the path may consist of the particle standing still at $A_2$.) The probability of any path is simply the product of the $P_{B,A}$ over all the branches traversed in the path. If $A$ is an interior point and $B$ is a boundary point, it is clear that as $N$ increases, we are simply adding longer and longer paths in the computation of $P^N(B, A)$. Thus

$$Q(B, A) = \sum_{\text{all paths joining } A \text{ to } B} (\text{probability of path}).$$
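The iterative scheme can be tried out on a very small example. The following is only a sketch under stated assumptions: the network (a chain B0 - I0 - I1 - B1 with equal capacitances), the node ordering, and the helper names `mat_mul` and `mat_power` are all hypothetical and are not taken from the text. It raises $P$ to a high power, reads off the upper right block as $Q$, and forms $u = \mathbf{v}H$.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(P, N):
    """Return P**N by repeated multiplication (N >= 0)."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(N):
        result = mat_mul(result, P)
    return result


# Hypothetical network: B0 - I0 - I1 - B1, all branches of equal capacitance.
# Node order (boundary first): B0, B1, I0, I1, so k = 2 and n = 4.
# Entry P[b][a] is the probability of stepping from node a to node b,
# so every column sums to 1 and the boundary nodes are absorbing.
P = [
    [1.0, 0.0, 0.5, 0.0],  # steps into B0
    [0.0, 1.0, 0.0, 0.5],  # steps into B1
    [0.0, 0.0, 0.0, 0.5],  # steps into I0
    [0.0, 0.0, 0.5, 0.0],  # steps into I1
]
k = 2

H = mat_power(P, 200)            # a high power, standing in for the limit
Q = [row[k:] for row in H[:k]]   # upper right block: k rows, n - k columns

# Boundary potential phi = (3, 0); the interior part w of v is arbitrary.
v = [3.0, 0.0, 7.0, -2.0]
u = [sum(v[i] * H[i][j] for i in range(len(v))) for j in range(len(v))]

print(Q)  # close to [[2/3, 1/3], [1/3, 2/3]]
print(u)  # close to [3, 0, 2, 1]
```

For this chain the interior values 2 and 1 are the averages of their neighbors, as the condition $u = uP$ requires, and they do not depend on the arbitrary entries 7 and $-2$ chosen for $\mathbf{w}$.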


13.9. Green’s reciprocity theorem in electrostatics 

As a slight generalization of capacitive networks, we may consider a system of charged conductors, each of which has a well-defined charge $\rho$ and a well-defined potential $\phi$. To describe the analogue of a branch for this sort of system requires the introduction of the dielectric displacement field and must be deferred until Chapter 16, where we consider electrostatics. The description in terms of charges and potentials, however, has much in common with capacitive networks.



Figure 13.21 

The total charges on each of the various conductors may be described in terms of a vector $\rho = (\rho_A, \rho_B, \ldots)^T$ in a space we may call $C_0$, while the potentials form a vector $\phi = (\phi_A, \phi_B, \ldots)$ in its dual space $C^0$. As with capacitive networks, the evaluation of an element of $C^0$ on an element of $C_0$ gives the electrostatic energy of the system of conductors: to be precise, the energy is $\tfrac12 \sum_A \rho_A \phi_A$.


(In Chapter 16 we will see that this expression will be equal to a certain integral

$$\int (d\phi, d\phi)\, dx\, dy\, dz$$

over the space between the conductors, where $\phi$ is the electrostatic potential. For now, we must content ourselves with a description of what happens at the conductors.)

The total conductor charges $\rho$ may be expressed in terms of the potentials $\phi$ by a Laplace operator $\Delta$, so that

$$\rho = -\Delta\phi.$$

The operator $\Delta$ depends upon the shapes of the various conductors, on their distribution in space, and on fundamental constants of electrostatics. Except in certain cases involving parallel planes or concentric spheres or cylinders, calculation of $\Delta$ is extremely difficult. However, as we shall learn in Chapter 16, a Green’s reciprocity theorem holds for systems of conductors just as it does for capacitive networks. To be specific, if $\rho$ and $\phi$ denote one possible distribution of total charges and potentials for a system of conductors, while $\rho'$ and $\phi'$ denote another possible distribution, then

$$\sum_A \rho_A \phi'_A = \sum_A \rho'_A \phi_A.$$

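It follows immediately from this theorem that $\Delta$ is a self-adjoint operator, represented by a symmetric matrix. To see this, let

$$\rho = -\Delta\phi, \qquad \rho' = -\Delta\phi'.$$

Then the reciprocity theorem reads $\sum_A (\Delta\phi)_A\,\phi'_A = \sum_A (\Delta\phi')_A\,\phi_A$, i.e.,

$$(\Delta\phi, \phi') = (\Delta\phi', \phi),$$

from which it follows that $\Delta$ is self-adjoint. In the physics literature, $-\Delta$ is usually called the matrix of capacitance coefficients.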









The inverse of the matrix $-\Delta$ is called the matrix of potential coefficients. In certain simple cases we can write this matrix down in terms of well-known formulas of elementary electrostatics. Consider, for example, a system of two concentric spheres, with radii $r_A$ and $r_B$, where $r_A < r_B$. For a sphere of radius $r_0$ bearing unit positive charge, the electric potential, expressed in Gaussian units, is

$$\phi(r) = \begin{cases} 1/r_0 & (r < r_0), \\ 1/r & (r > r_0). \end{cases}$$

(See, for example, Purcell, Electricity and Magnetism, section 1.11.) For an alternative treatment, see later in sections 16.1 and 16.2 where we derive this result from first principles.

If $\rho_A = 1$, $\rho_B = 0$, the potentials are therefore $\phi_A = 1/r_A$, $\phi_B = 1/r_B$, while, if $\rho_A = 0$, $\rho_B = 1$, both potentials are equal: $\phi_A = \phi_B = 1/r_B$. It follows that

$$-\Delta^{-1} = \begin{pmatrix} 1/r_A & 1/r_B \\ 1/r_B & 1/r_B \end{pmatrix}.$$


This matrix permits us to calculate the potential of the two spheres for an arbitrary 
charge distribution. Its inverse gives the Laplace operator: 


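$$-\Delta = \left(\frac{1}{r_A r_B} - \frac{1}{r_B^2}\right)^{-1} \begin{pmatrix} 1/r_B & -1/r_B \\ -1/r_B & 1/r_A \end{pmatrix}$$

or

$$-\Delta = \frac{r_A r_B}{r_B - r_A} \begin{pmatrix} 1 & -1 \\ -1 & r_B/r_A \end{pmatrix}.$$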



This matrix determines the charges on the two spheres for specified potentials. For the case $\phi_A = 1$, $\phi_B = 0$ it gives

$$\begin{pmatrix} \rho_A \\ \rho_B \end{pmatrix} = \frac{r_A r_B}{r_B - r_A} \begin{pmatrix} 1 \\ -1 \end{pmatrix},$$

i.e., there are equal and opposite charges of magnitude $r_A r_B/(r_B - r_A)$ on the two spheres. This quantity $r_A r_B/(r_B - r_A)$ is called the capacitance of the pair of spheres.
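The inversion for two spheres is easy to check with exact arithmetic. The sketch below is illustrative only: the radii $r_A = 1$, $r_B = 2$ and the function names are assumptions made for the example. It inverts the matrix of potential coefficients directly and compares the result with the closed form $\frac{r_A r_B}{r_B - r_A}\begin{pmatrix} 1 & -1 \\ -1 & r_B/r_A \end{pmatrix}$.

```python
from fractions import Fraction


def potential_coefficients(r_a, r_b):
    """Matrix -Delta^{-1} for two concentric spheres, r_a < r_b (Gaussian units)."""
    return [[Fraction(1) / r_a, Fraction(1) / r_b],
            [Fraction(1) / r_b, Fraction(1) / r_b]]


def invert_2x2(m):
    """Exact inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def capacitance_coefficients_closed_form(r_a, r_b):
    """Closed form r_a r_b / (r_b - r_a) * [[1, -1], [-1, r_b / r_a]]."""
    c = Fraction(r_a * r_b, r_b - r_a)
    return [[c, -c], [-c, c * Fraction(r_b, r_a)]]


r_a, r_b = 1, 2  # sample radii, chosen only for illustration
minus_delta = invert_2x2(potential_coefficients(r_a, r_b))

# Charges for potentials phi_A = 1, phi_B = 0: the first column of -Delta.
charges = [minus_delta[0][0], minus_delta[1][0]]

print(minus_delta == capacitance_coefficients_closed_form(r_a, r_b))
print(charges)
```

With these radii the capacitance $r_A r_B/(r_B - r_A)$ equals 2, and the two charges come out equal and opposite, as stated.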



The matrix $-\Delta^{-1}$ is equally easy to write down for any number of concentric spheres. For example, with three spheres as shown in figure 13.23 it is

$$-\Delta^{-1} = \begin{pmatrix} 1/r_A & 1/r_B & 1/r_C \\ 1/r_B & 1/r_B & 1/r_C \\ 1/r_C & 1/r_C & 1/r_C \end{pmatrix}.$$
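For more than two spheres the inversion is best left to a machine. The sketch below assumes that the spheres are listed from innermost to outermost, so that the entry in row $i$ and column $j$ of $-\Delta^{-1}$ is $1/\max(r_i, r_j)$, which is the pattern of the matrix above. The radii 1, 2, 4 and the helper names are hypothetical choices made for illustration.

```python
from fractions import Fraction


def potential_coefficient_matrix(radii):
    """Matrix -Delta^{-1} for concentric spheres; entry (i, j) is 1 / max(r_i, r_j)."""
    return [[Fraction(1) / max(ri, rj) for rj in radii] for ri in radii]


def invert(matrix):
    """Exact inverse by Gauss-Jordan elimination on the augmented matrix [M | I]."""
    n = len(matrix)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


radii = [1, 2, 4]  # sample radii, innermost sphere first
minus_delta = invert(potential_coefficient_matrix(radii))

for row in minus_delta:
    print([str(x) for x in row])
```

For these radii the inverse has zeros in the corners: in this example the charge on the innermost sphere involves only its own potential and that of its immediate neighbor.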


Let us now, as for capacitive networks, imagine that some of the conductors are boundary conductors whose potential may be established by connecting them to batteries, while others are interior conductors whose charge may be specified. In a Poisson equation problem, we are given $\phi' = 0$ on all boundary conductors, while $\rho'$ is specified on all interior conductors, and we are required to determine $\phi'$ for the interior conductors, $\rho'$ for the boundary conductors. In a Dirichlet problem, on the other hand, we are given $\rho = 0$ on interior conductors, while $\phi$ is specified on all boundary conductors, and we must determine $\phi$ for interior conductors, $\rho$ for boundary conductors.

The two types of problems are related by Green’s reciprocity theorem. In general

$$\sum_{\text{boundary}} \rho'_A \phi_A + \sum_{\text{interior}} \rho'_B \phi_B = \sum_{\text{boundary}} \rho_A \phi'_A + \sum_{\text{interior}} \rho_B \phi'_B.$$

But given that $\phi' = 0$ on the boundary (Poisson) and $\rho = 0$ in the interior (Dirichlet), we get

$$-\sum_{\text{interior}} \rho'_B \phi_B = \sum_{\text{boundary}} \rho'_A \phi_A.$$

Suppose we denote by $Q(C, A)$ the charge $\rho'_A$ on boundary conductor $A$ for the Poisson equation problem in which unit charge is placed on interior conductor $C$, while the charge is zero for all other interior conductors. Then only one term survives on the left-hand side, and we have

$$-\phi_C = \sum_{\text{boundary}} Q(C, A)\,\phi_A.$$

This is the Green’s function solution to the Dirichlet problem.





































Figure 13.24





In practice, it is frequently easy, for systems of conductors, to solve Dirichlet’s problem by inspection, then to use Green’s reciprocity theorem to solve Poisson’s equation. Consider, for example, the system of three large parallel conducting planes shown in figure 13.24. We regard $A$ and $C$ as boundary conductors, $B$ as an interior conductor. Since potential is a linear function of position between the planes for this geometry, it is apparent that

$$\phi_B = \phi_A + \tfrac13(\phi_C - \phi_A)$$

or

$$\phi_B = \tfrac23\phi_A + \tfrac13\phi_C.$$

This is a solution to Dirichlet’s problem.

Now consider Poisson’s equation, with charge $\rho'_B$ on the middle plane, $\phi'_A = \phi'_C = 0$. By the reciprocity theorem,

$$\rho'_B \phi_B = -\rho'_A \phi_A - \rho'_C \phi_C.$$

But $\phi_B = \tfrac23\phi_A + \tfrac13\phi_C$ for an arbitrary Dirichlet problem, so

$$\rho'_B\left(\tfrac23\phi_A + \tfrac13\phi_C\right) = -\rho'_A \phi_A - \rho'_C \phi_C.$$

It follows that

$$\rho'_A = -\tfrac23\rho'_B, \qquad \rho'_C = -\tfrac13\rho'_B,$$

so that we have learned how the induced charge is distributed between the two boundary planes.
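The reciprocity argument for the three planes can be checked numerically. The following sketch rests on assumptions that are not spelled out in the text: plane $B$ is taken to lie one third of the way from $A$ to $C$, and each gap is modelled as a parallel-plate capacitor whose capacitance is inversely proportional to its width (units are arbitrary). The function names are hypothetical.

```python
from fractions import Fraction

# Assumed geometry: gaps A-B and B-C have widths 1 and 2 in arbitrary units,
# and each gap acts as a capacitor with capacitance proportional to 1 / width.
C_AB = Fraction(1, 1)
C_BC = Fraction(1, 2)


def dirichlet_phi_b(phi_a, phi_c):
    """Potential of the uncharged middle plane (rho_B = 0)."""
    return (C_AB * phi_a + C_BC * phi_c) / (C_AB + C_BC)


def poisson_boundary_charges(rho_b):
    """Charges induced on the grounded planes A and C by charge rho_b on B."""
    phi_b = rho_b / (C_AB + C_BC)
    return -C_AB * phi_b, -C_BC * phi_b


phi_a, phi_c = Fraction(5), Fraction(-2)   # arbitrary Dirichlet data
phi_b = dirichlet_phi_b(phi_a, phi_c)

rho_b_prime = Fraction(1)                  # unit charge on the middle plane
rho_a_prime, rho_c_prime = poisson_boundary_charges(rho_b_prime)

# Reciprocity with phi' = 0 on the boundary and rho = 0 in the interior:
# rho'_B phi_B + rho'_A phi_A + rho'_C phi_C should vanish.
reciprocity_defect = (rho_b_prime * phi_b
                      + rho_a_prime * phi_a
                      + rho_c_prime * phi_c)

print(phi_b, rho_a_prime, rho_c_prime, reciprocity_defect)
```

Under these assumptions the induced charges are $-\tfrac23$ and $-\tfrac13$ of the charge on the middle plane: the nearer boundary plane receives the larger share.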

More generally, Green’s reciprocity theorem may be applied to problems involving both conductors and specified distributions of charge in the region bounded by these conductors. For example, if a charge $+q$ is placed near an infinitely large conducting plane, the electric potential everywhere on the same side of the plane as the charge may be determined by the method of images; see section 16.1. From this potential function it is possible to determine the distribution of negative charge over the conducting plane. This in turn makes possible the solution of the Dirichlet problem in which the potential is specified everywhere on one plane and it is required to determine the potential throughout all space. In a similar manner, one can construct the Green’s function for the Poisson equation problem of a point charge inside a grounded conducting sphere and use it to solve the Dirichlet problem in which the potential is specified on the surface of a sphere, and it is required to determine the potential within the charge-free region bounded by the sphere.


Summary 

A Orthogonal projection 

You should be able to state the properties of the self-adjoint projection operator $\pi$ for resistive networks and to use $\pi$ to solve network problems.

You should be able to construct $\pi$ by using the Gram-Schmidt process or by Kirchhoff’s method.

B Capacitive networks

You should be able to solve capacitive network problems by using Maxwell’s methods or by inverting the Laplace operator.

You should be able to solve Poisson’s equation or the Dirichlet problem for a capacitive network by means of orthogonal projection and to describe the associated decomposition of $C^1$ into orthogonal subspaces.

C Green’s functions

You should know how to construct the Green’s function for a capacitive network and to use it in solving Poisson’s equation or the Dirichlet problem.


Exercises 


13.1. Consider the network shown in figure 13.26.

Figure 13.26

(a) Write down a basis for $Z_1$. Explain the meaning of the statement ‘$\pi$ represents a projection from $C_1$ onto the subspace $Z_1$’. By carrying out appropriate computations verify that the matrix

represents a projection from $C_1$ onto $Z_1$.

(b) Use $\pi$ to determine the branch currents in the above network. Make a sketch which shows the image and kernel of $\pi$ and indicates how it solves the problem.

(c) What linear transformation is represented by $\pi^T$, the transpose of $\pi$? Name, and characterize in physical terms, the space which is the kernel of this transformation. Support your answer either by invoking a general theorem or by citing appropriate properties of the matrix $\pi^T$.





to an orthonormal basis $\{E_1, E_2\}$ for $Z_1$. By applying $\pi I = (E_1, I)_Z E_1 + (E_2, I)_Z E_2$ in turn to each of the four basis vectors of $C_1$, construct the matrix representing $\pi$, and use it to solve the above network.

(b) Verify the formula $\pi = \sigma(sZ\sigma)^{-1}sZ$ for the above network.

(c) Construct the matrix $Z(1 - \pi)$, and show that its image is the space $B^1$.


(22 28 24 4\ 


1 

14 — 12 -? 


Answer: % = •— 

CA 

Q_ Q % 6 

- 

jU 

O O JO o 



\ 8 -8 36 6/ 


13.4. (a) For the network of Exercise 13.3, construct a matrix to represent

$\tau: C^1 \to C^1$

which projects $C^1$ onto $B^1$, orthogonally with respect to the scalar product defined by $Z^{-1}$, i.e., $(V, V')_{Z^{-1}} = \int_{Z^{-1}V} V'$. To construct a basis for $B^1$, choose the potential to be 0 at $A$ (ground) and $-1$ in turn at $B$ and at $C$.
(b) Show that $V = \tau(W - ZK)$, and use this result to solve the network of Exercise 13.3.

13.5. (a) Find an explicit expression for the matrix $\tau$ of Exercise 13.4 in terms of $[\partial]$, $Z^{-1}$ and $[d]$. Use it to show in general that $\tau^2 = \tau$, that the image of $\tau$ is $B^1$, and that $\tau$ is self-adjoint with respect to the scalar product defined by $Z^{-1}$.
(b) Show that for any network, the matrix

$$Z^{-1}[d]\,([\partial]Z^{-1}[d])^{-1}[\partial] + \sigma(sZ\sigma)^{-1}sZ,$$

which represents a linear transformation from $C_1$ to $C_1$, is the identity matrix.

13.6. Apply Kirchhoff’s method to Exercise 13.3, as follows:

(a) There are five different maximal trees. Find the projection operator $p_T$ (a $4 \times 4$ matrix) for each tree. Note that each pair of branches except $\{\alpha, \beta\}$ defines a tree. (Watch the signs - all diagonal entries in $p_T$ must be $\ge 0$.)

(b) Compute $Q_T$ for each tree and find $R = \sum_T Q_T$. (We get $R = 50$, the same as $\mathrm{Det}(sZ\sigma)$. Is that true in general?)

(c) Use Kirchhoff’s formula

$$\pi = \frac{1}{R} \sum_T Q_T\, p_T$$

to obtain the projection operator $\pi$ of the Weyl method.

13.7. Invent a variation of Kirchhoff’s method which constructs the self-adjoint
projection operator $\tau: C^1 \to C^1$, with the property that $V = \tau(W - ZK)$, as
a weighted average of projections from C 1 onto Z 1 which are not self- 
adjoint. One way to proceed is to start from Kirchhoff’s method in C l and 
multiply both sides by Z. 

13.8. Prove the formula $\pi = \sigma(sZ\sigma)^{-1}sZ$ for the Weyl projection operator by verifying that $\pi$ has all the required properties:

(a) $\pi$ is a projection: $\pi^2 = \pi$.

(b) The image of $\pi$ is $Z_1$.

(c) $\pi$ is self-adjoint relative to $(\ ,\ )_Z$; i.e., $(\pi I, I')_Z = (I, \pi I')_Z$.

13.9. (a) Construct the matrix $\pi$ which represents the orthogonal projection of $C_1$ onto the subspace $Z_1$ for the network shown in figure 13.29. Use this
(They are all integers.)

short circuit across $CD$? If, instead, you connect the same battery across $CD$, what current flows in a short circuit across $AB$?
(b) If you connect a 25 ampere current source across $AB$, what voltage is developed across $CD$ (left open-circuited)? If, instead, you connect the same current source across $CD$, what voltage is developed across $AB$?




$Z\pi$. Use it to determine the branch voltages $V$ if an 8 ampere current source is inserted in branch $\alpha$. Also determine $V$ for the case of an 8 ampere current source in branch $\gamma$.

(b) For the same network, construct the symmetric matrix $\pi Z^{-1}$. Use it to determine the branch currents $I$ for the case of a 16 volt battery inserted in branch $\alpha$ or in branch $\gamma$.














13.13. In the network shown in figure 13.33, $\rho_B = 13$ and $\rho_C = -1$ (so $\rho_A = -12$).

Solve Poisson’s

13.15. Consider the network of capacitors shown in figure 13.35. Nodes $A$ and $B$ are boundary nodes; node $C$ is an interior node.


(b) The vector $W$ corresponds to a situation in which there are four units of charge at node $C$. Use $\pi$ to determine how this charge will distribute itself if nodes $A$ and $B$ are both grounded.


(c) Suppose the potential at A is cp^ = i 


is $\Phi_B = 4$, and

Figure 13.36


(a) Write down the matrices $\partial$ and $d$ for this network.

defined by the capacitance matrix $C$.

(c) Let $\pi$ represent the orthogonal projection from $C^1$ onto the subspace spanned by $d\phi_1$ and $d\phi_2$. Evaluate $\pi$

(d) Suppose that there is one unit of charge at node $B$, no charge at node $C$, and nodes $A$ and $D$ are grounded. Use orthogonal projection to calculate the potential at $B$ and at $C$. Find the charge at $A$ and $D$.

(e) Suppose that $\Phi_A = 5$, $\Phi_D = 0$, and there is no charge at nodes $B$ and $C$. Use the method of orthogonal projection to determine the branch voltages $V$.

(f) For the problem posed in (e) show how to determine $\Phi_B$ by using the Green’s function of part (d).



Figure 13.37 





13.17. Consider the network of capacitors shown in figure 13.37. Nodes A and B 
are boundary nodes: nodes C and D are interior nodes. 

(a) Consider the vectors 


ti = 


f°\ 
0 


and 


0 




[ 0 \ 

0 

2 


Ui 


both elements of the space $C^0$. Name, and characterize in physical terms, the subspace which they span. Determine $d\psi_1$ and $d\psi_2$, and verify that they form an orthogonal basis for a subspace of $C^1$, relative to the scalar product defined by $C$.

(b) Let $\pi$ denote the orthogonal projection of $C^1$ onto the subspace for which you have just found a basis. Characterize in physical terms the vectors which lie in the kernel of $\pi$. Determine $\pi$

(c) Suppose there is 1 unit of positive charge at $D$, none at $C$, and nodes $A$ and $B$ are grounded. Using the result of part (b), find all the branch voltages and charges.

(d) Suppose there are 2 units of charge at $D$, no charge at $C$, $\Phi_A = 4$ and

determine the potential at $D$.

(Hint: The problem is a superposition of Poisson and Dirichlet.)
13.18. Consider the network in figure 13.38. Regard $A$, $B$, and $C$ as boundary nodes, $D$ and $E$ as interior nodes.

Figure 13.38

(a) Find a basis for the subspace $C^0_{\text{int}} \subset C^0$ and for the subspace $dC^0_{\text{int}} \subset C^1$. Verify that the matrix

projects $C^1$ onto this subspace.

(b) Construct $1 - \pi$, and verify that its image is orthogonal to the image of $\pi$.

(c) Verify that $H^1$ (the image of $Z_1$ under the action of $C^{-1}$) is in the kernel of $\pi$.

(d) Construct the matrix $(1 - \pi)d$, which represents a transformation from $C^0$ to $C^1$. Verify explicitly that if $\Phi$ is any basis vector in $C^0_{\text{bound}}$ then $V = -(1 - \pi)d\Phi$ describes a set of branch voltages for which $\partial CV = 0$ at interior nodes $D$ and $E$.

13.19. For the network of exercise 13.18 use the matrix $(1 - \pi)d$ to solve Dirichlet’s problem for potentials $\Phi_A = 3$, $\Phi_B = -5$, $\Phi_C = 0$, with $\rho_D = \rho_E = 0$. Determine $\rho_A$, $\rho_B$ and $\rho_C$.

13.20. In the preceding exercise, solve the Poisson equation for the case where

$$\rho_E = 1, \quad \rho_D = 0, \quad \Phi_A = \Phi_B = \Phi_C = 0,$$

as follows:

(a) Choose any $Q$ compatible with the conditions $\rho_D = 0$, $\rho_E = 1$. Now form $V = C^{-1}Q$, the associated branch voltages.

(b) Calculate $\pi V$, and determine a potential $\Phi$ such that $\Phi = 0$ on the boundary and $-d\Phi = \pi V$.

(c) Repeat steps (a) and (b) with a different $Q$. You should get the same answer.

(d) Solve Poisson’s equation for the case $\rho_D = 1$, $\rho_E = 0$. You have now constructed the Green’s function; use it to solve Poisson’s equation for $\rho_D = 10$, $\rho_E = 7$.

(e) Using the Green’s function, construct the Poisson kernel $\Delta G(A, B)$. This is a matrix with three columns and two rows which represents a mapping of

$$\begin{pmatrix} \Phi_A \\ \Phi_B \\ \Phi_C \end{pmatrix} \quad \text{into} \quad -\begin{pmatrix} \Phi_D \\ \Phi_E \end{pmatrix}.$$

(f) Check that the Poisson kernel gives the correct solution to the Dirichlet problem posed in Exercise 13.18(c).

13.21. A capacitor consists of three long coaxial cylinders, each of length $l$, whose radii are $r_A = a$, $r_B = 2a$, $r_C = 8a$.

(a) Solve the Dirichlet problem in which cylinder $B$ is uncharged and the potential difference between cylinders $A$ and $C$ is $\Phi^0$.

(b) Using Green’s reciprocity theorem, solve the Poisson equation problem in which cylinders $A$ and $C$ are grounded and charge $Q$ is placed on cylinder $B$.

13.22. Three concentric conducting spheres have radii $r_A = 2$, $r_B = 3$, $r_C = 4$ respectively.

(a) Write down the matrix $-\Delta^{-1}$ for this system, and invert it to construct the Laplace operator $-\Delta$.

(b) Suppose the potentials are $\phi_A = 0$, $\phi_B = 1$, $\phi_C = 0$. Determine the charge on each sphere.

13.23. Four large parallel conducting planes, each of area $A_0$, are separated by a distance $l$.

(a) Solve by inspection the Dirichlet problem in which $\phi_A$ and $\phi_D$ are

problem in which planes $A$ and $D$ are grounded while charges $Q_B$ and $Q_C$ are placed on the interior planes. Determine the potentials $\phi_B$ and


In Chapter 14 we conclude our introduction to algebraic topology by sketching how the one-dimensional results of Chapters 12 and 13 generalize to higher dimensions.


14.1. Complexes and homology 


We now interrupt our study of electrical circuits in order to study some extensions of the mathematical methods of the preceding two sections. We have defined a one-dimensional complex as a collection of zero-dimensional objects called nodes and one-dimensional objects called branches, with the branches joined together at the nodes. Of particular interest in the study of such a complex are the notion of connectivity, which reflects itself in the space $H_0$, and the number of independent meshes, which equals the dimension of the space $H_1$. We now want to generalize to higher dimensions by considering complexes which include elements of two, three, or more dimensions. A two-dimensional complex or two-complex will include, in addition to nodes and branches, two-dimensional elements (bits of surface), which we shall call two-cells or faces. As indicated in figure 14.1, these faces can attach to one another along the branches. The boundary of a face will consist of a collection of branches, each with a plus or minus sign. Similarly, a three-dimensional complex will include three-dimensional elements (bits of volume), called three-cells, which can attach to one another along faces. The boundary of a three-cell will consist of a collection of faces, each with a plus or minus sign.

Figure 14.1

In general, an $n$-dimensional complex will consist of a collection of sets $S_0, S_1, S_2, \ldots, S_n$, related by some conditions which we shall describe below after we have introduced some notation. The set $S_0$ will consist of the zero-dimensional elements, or nodes, of the complex, the set $S_1$ of the one-dimensional elements, or branches, the set $S_2$ of the two-dimensional elements, or faces, and so on. Since we do not have enough alphabets handy to assign a different alphabet to each dimension, we shall sometimes denote the elements of $S_0$ by $0_1, 0_2, 0_3, \ldots$, the elements of $S_1$ by $1_1, 1_2, 1_3, \ldots$, the elements of $S_2$ by $2_1, 2_2, 2_3, \ldots$, and so on. We shall continue in examples to use Latin letters to denote elements of $S_0$, Greek letters to denote elements of $S_1$, and names (e.g., Red, Bill, Canada) to denote elements of $S_2$, $S_3$, etc.

For each set $S_k$ we now introduce the vector space of $k$-chains, $C_k$, consisting of vectors whose components are indexed by the elements of $S_k$. The spaces $C_0$ and $C_1$ are already familiar from our study of electric networks. In general, the dimension of the space $C_k$ equals the number of elements of $S_k$. As before, we shall identify an element of $S_k$ with the vector which is one in the position corresponding to that element, zero in all other positions. Thus, for example, in the complex of figure 14.1, we have vectors in $C_0$: $0_1 = A = (1,0,0,\ldots)^T$, $0_2 = B = (0,1,0,\ldots)^T$, etc.; vectors in $C_1$: $1_1 = \alpha = (1,0,0,\ldots)^T$, $1_2 = \beta = (0,1,0,\ldots)^T$, etc.; vectors in $C_2$: $2_1 = \text{Red} = (1,0)^T$ and $2_2 = \text{Blue} = (0,1)^T$.


The boundary operator $\partial$ is actually a sequence of boundary operators, which we shall denote all by the same symbol. To each element of $S_k$ we assign an element of $C_{k-1}$ called its boundary: this defines $\partial$ on each basis element of $C_k$ and hence determines a linear map of $C_k$ into $C_{k-1}$. (For $k = 0$ we take $\partial = 0$, since there are no elements of negative dimension.)


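For the two-complex of figure 14.1 we have

$\partial\alpha = B - A$, $\partial\beta = C - A$, etc.

The corresponding boundary operator is represented by an $8 \times 6$ matrix. To write down this operator, it was essential to have specified the orientation of each branch by an arrow on the diagram. Similarly in order to specify the boundary of a face, we must assign an orientation to each face, either clockwise or counterclockwise as shown on the diagram. The boundary of a face is the closed path around its perimeter, traversed in the direction determined by the orientation of the face. In figure 14.1, for example, $\partial(\text{Red}) = \alpha + \gamma - \beta$ while $\partial(\text{Blue}) = \gamma + \nu - \varepsilon - \delta$. As a map from $C_2$ to $C_1$, the boundary operator is likewise represented by a matrix.

We define the space $Z_k$ of $k$-cycles to be the subspace of $C_k$ consisting of elements with zero boundary, i.e., the kernel of $\partial: C_k \to C_{k-1}$. In the complex of figure 14.1, for example: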

$Z_2$ is the zero subspace, because no non-trivial linear combination of Red and Blue has zero boundary;

$Z_1$ is the three-dimensional subspace spanned by the meshes, for which a basis is

$$M_1 = \alpha + \gamma - \beta, \quad M_2 = \gamma + \nu - \varepsilon - \delta, \quad M_3 = \lambda - \mu - \nu;$$

$Z_0$ is the entire space $C_0$, since by definition $\partial c = 0$ for any zero-chain $c$.


Similarly, we define the space $B_k$ of $k$-boundaries to be the subspace of $C_k$ consisting of elements which are boundaries of elements of $C_{k+1}$; i.e., the image of $\partial: C_{k+1} \to C_k$. For the complex of figure 14.1:

$B_2$ is the zero subspace, because there are no three-dimensional elements in a two-complex (similarly, in the one-complexes which we have considered, the subspace $B_1$ was always the zero subspace);

$B_1$ is the two-dimensional space spanned by $\partial(\text{Red}) = \alpha + \gamma - \beta$ and $\partial(\text{Blue}) = \gamma + \nu - \varepsilon - \delta$;

$B_0$ is a five-dimensional subspace, exactly as if the complex were a one-complex rather than a two-complex.


In this example, each space $B_k$ is a subspace of the corresponding $Z_k$; i.e., every boundary is a cycle. A moment’s thought will convince you that this property must hold for any two-complex defined by a diagram like figure 14.1. Since $B_2$ is the zero subspace, it is trivially a subspace of $Z_2$. Since the boundary of any polygon is a closed path, $B_1$ is a subspace of $Z_1$. Finally, since $Z_0$ is the entire space $C_0$, $B_0$ is trivially a subspace of $Z_0$. The condition $B_k \subset Z_k$ can be stated somewhat differently. Since $B_k$ is the image of $\partial: C_{k+1} \to C_k$ and $Z_k$ is the kernel of $\partial: C_k \to C_{k-1}$, the condition $B_k \subset Z_k$ means that the composition $\partial \circ \partial: C_{k+1} \to C_{k-1}$ is zero. To put it in words: the boundary of a boundary is zero. In the examples which follow, it will become clearer why this is so.








Homology 


Because $B_k$ is a subspace of $Z_k$, we can form the quotient space $H_k = Z_k/B_k$. $H_k$, called the $k$th homology space of the complex, will turn out to be of more fundamental significance than either $Z_k$ or $B_k$ individually. While $Z_k$ and $B_k$ relate to properties of a specific complex, $H_k$, as we shall see, relates to properties of the space built up from the elements of the complex.

We have already met the spaces $H_0$ and $H_1$ in our study of networks. Recall that the space $H_0$ has a dimension equal to the number of connected components, and that its dual space $H^0$ can be interpreted as the space of potentials which are constant on each connected component. Since the space $B_1$ of boundaries is the zero subspace for a one-complex (there are no two-dimensional elements), $H_1 = Z_1/B_1 = Z_1$. Thus, for any one-complex, $H_1$ is just the space $Z_1$ of cycles which played such an important role in Maxwell’s mesh-current method.

Figure 14.2

Let us now consider the spaces $H_k$ for the two-complex of figure 14.1. Since there are no two-cycles, $Z_2$ is $\{0\}$, and so is $H_2$. Since $Z_1$ is three-dimensional, while $B_1$ is only two-dimensional, the quotient space $H_1 = Z_1/B_1$ is one-dimensional. Looking at the figure, we see why: the path $\lambda - \mu - \nu$ is not the boundary of any element of $C_2$. If the triangle $CEF$ were added to the complex as a third basis element of $C_2$, then $\lambda - \mu - \nu$ would become a boundary, and $H_1$ would have dimension zero. Finally, because the complex is connected, the space $H_0$ is one-dimensional.


As an example of a three-complex, consider a solid tetrahedron with all its faces, edges (branches), and vertices (nodes), as shown in figure 14.2(a). The surface of the tetrahedron, unfolded, is shown in figure 14.2(b). There are four vertices, $A, B, C, D$, and six edges, $\alpha, \beta, \gamma, \delta, \varepsilon, \eta$. There are four faces, labelled Red, Yellow, Green, and Blue in the figure. Each face has been assigned an orientation, described by the circulating arrows in the figure. These orientations have all been chosen so that the arrows circulate counterclockwise when the tetrahedron is viewed from the outside (see Yellow and Green in figure 14.2(a)) and clockwise when the tetrahedron is viewed from the inside (see figure 14.2(b)).

From the figure it is easy to read off the boundary operator from $C_2$ to $C_1$, for example:

$$\partial(\text{Red}) = -\alpha - \beta - \gamma,$$

$$\partial(\text{Green}) = \alpha + \varepsilon - \delta.$$

Since the boundary of each face is a cycle, each such boundary itself has zero boundary, for example:

$$\partial[\partial(\text{Green})] = \partial(\alpha + \varepsilon - \delta) = (B - A) + (D - B) - (D - A) = 0.$$

Finally, there is one three-dimensional element in the complex, the solid tetrahedron itself. In order to write down the boundary operator from $C_3$ to $C_2$, we must assign this element an orientation. It is no longer an easy matter, though, to depict this orientation on a diagram: we must do it more abstractly. We will choose the so-called right-handed orientation, which means that, when the element is viewed from outside, all of the faces which comprise its boundary appear with a counterclockwise orientation. The reason for the term right-handed is that if you were to place your right hand on the surface of the tetrahedron, with the thumb pointing outward, then the fingers of the right hand would circulate in the sense appropriate to a face which appears with a plus sign in the boundary. Of course, this is stated much more succinctly in algebraic terms by writing down the boundary operator from $C_3$ to $C_2$:

$$\partial(\text{Tetrahedron}) = \text{Red} + \text{Yellow} + \text{Green} + \text{Blue}.$$

If a left-hand orientation had been chosen for the tetrahedron, then all four faces would have appeared with negative signs in the expression for the boundary of the tetrahedron. Likewise, reversing the orientation of any face would change its sign in the expression for the boundary.





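It is obvious that the boundary of the tetrahedron, a closed surface, has zero boundary. This may readily be confirmed algebraically by using the expressions for the boundary operator from $C_3$ to $C_2$ and from $C_2$ to $C_1$:

$$\partial[\partial(\text{Tetrahedron})] = \partial[\text{Red} + \text{Yellow} + \text{Green} + \text{Blue}]$$

$$= (-\alpha - \beta - \gamma) + (\beta + \eta - \varepsilon) + (\alpha + \varepsilon - \delta) + (\gamma + \delta - \eta) = 0.$$

This result in no way depends on the orientations chosen for the tetrahedron or for any of its faces or edges! (Try reversing the orientation of a face and a couple of edges just to appreciate why this is so.)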
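The suggested experiment is easy to carry out mechanically. In the sketch below a chain is stored as a dictionary from cell names to integer coefficients. The face boundaries are the four bracketed terms of the computation above, and the boundaries of $\alpha$, $\delta$ and $\varepsilon$ are those used for $\partial[\partial(\text{Green})]$. The boundaries of $\beta$, $\gamma$ and $\eta$ are not stated in the text: they are inferred here as the only choice that makes every face boundary a cycle, and should be read as an assumption. The function names are hypothetical.

```python
EDGE_BOUNDARY = {
    "alpha":   {"B": 1, "A": -1},
    "beta":    {"C": 1, "B": -1},   # inferred, not stated in the text
    "gamma":   {"A": 1, "C": -1},   # inferred, not stated in the text
    "delta":   {"D": 1, "A": -1},
    "epsilon": {"D": 1, "B": -1},
    "eta":     {"D": 1, "C": -1},   # inferred, not stated in the text
}
FACE_BOUNDARY = {
    "Red":    {"alpha": -1, "beta": -1, "gamma": -1},
    "Yellow": {"beta": 1, "eta": 1, "epsilon": -1},
    "Green":  {"alpha": 1, "epsilon": 1, "delta": -1},
    "Blue":   {"gamma": 1, "delta": 1, "eta": -1},
}
SOLID_BOUNDARY = {
    "Tetrahedron": {"Red": 1, "Yellow": 1, "Green": 1, "Blue": 1},
}


def boundary(chain, table):
    """Apply the boundary operator described by `table` to a chain, by linearity."""
    result = {}
    for cell, coeff in chain.items():
        for face, sign in table[cell].items():
            result[face] = result.get(face, 0) + coeff * sign
    return {cell: c for cell, c in result.items() if c != 0}


def reverse_orientation(name, own_table, parent_table):
    """Copy both tables with the orientation of cell `name` reversed.

    Reversing a cell changes the sign of its own boundary and the sign with
    which it appears in the boundary of every higher-dimensional cell.
    """
    own = {cell: dict(b) for cell, b in own_table.items()}
    parent = {cell: dict(b) for cell, b in parent_table.items()}
    own[name] = {k: -v for k, v in own[name].items()}
    for b in parent.values():
        if name in b:
            b[name] = -b[name]
    return own, parent


# Boundary of a boundary for the solid and for each face.
dd_solid = boundary(boundary({"Tetrahedron": 1}, SOLID_BOUNDARY), FACE_BOUNDARY)
dd_faces = {f: boundary(boundary({f: 1}, FACE_BOUNDARY), EDGE_BOUNDARY)
            for f in FACE_BOUNDARY}

# Reverse the edge alpha and repeat the check for the faces.
edges_rev, faces_rev = reverse_orientation("alpha", EDGE_BOUNDARY, FACE_BOUNDARY)
dd_faces_rev = {f: boundary(boundary({f: 1}, faces_rev), edges_rev)
                for f in faces_rev}

print(dd_solid, dd_faces, dd_faces_rev)  # every chain printed is empty, i.e. zero
```

The function `reverse_orientation` makes the reason visible: a reversed cell changes sign both in its own boundary and wherever it occurs in a higher boundary, so the two sign changes cancel in the composition.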

So far our complexes have all been defined initially by figures, and we have translated the figures into an algebraic specification of the boundary operator $\partial$. It is apparent that the boundary operator determines the complex as completely as the figure does, and it is tempting to try to specify a complex without drawing a diagram, simply by listing the elements and writing down the boundary operator. When we do so, we shall insist that for any element $c$ of the complex $\partial(\partial c) = 0$.

Let us see why the condition $\partial(\partial c) = 0$ is a reasonable one to impose on any complex, even, for example, a surface which cannot be assembled in three-dimensional space or a complex whose elements include four-dimensional regions of spacetime. To verify that $\partial(\partial c) = 0$ holds for all $c$, it is sufficient to verify that it holds for the generators, i.e., for the elements of $S_k$. Now for any such element, we can construct the subcomplex consisting of this element, all the elements of $S_{k-1}$ which enter non-trivially into



component we started with. So, although the interconnections between the various building blocks might be complicated, we expect that the individual elements out of which we are building our complex are reasonably familiar geometrical objects. For example, we might want to assume that every element of $S_3$ looks like some convex polyhedron in three-dimensional space (at least as far as its relations with its boundary are concerned - we will allow for all kinds of continuous geometrical distortion in the actual shape of the object, much as we did not care about the shape of the wires in studying circuit theory). Now the boundary of such a polyhedron will consist of a finite number of polygonal faces, each of which can be oriented counterclockwise when we look at the polyhedron from the outside. When we compute the sum of the boundaries of these polygonal faces, each edge is shared by exactly two faces and they induce opposite orientations, so the sum of the boundaries is zero. Thus the boundary of every element of $S_3$ is a cycle in this case.

Similarly, we might be willing to assume that every element of $S_2$ looks like some convex polygon in the plane. It is then clear that the boundary of any element of $S_2$ is a cycle (in fact it corresponds to a mesh).

Let us investigate the homology spaces of the complex of the tetrahedron. 
Because the complex is connected, dim H_0 = 1.

To determine H_1, we may consider the two-complex determined by the vertices,
edges, and faces. We can analyze the space Z_1 by the method already familiar from
our study of electric networks. Referring to figure 14.2, we see that branches δ, ε, and
η form a maximal tree. Each of the remaining branches determines an
independent cycle and so dim Z_1 = 3. Each of these cycles is the boundary of a



space B_1 ⊂ C_1, is three-dimensional. Since C_2 is four-dimensional, the kernel of ∂ is




Blue, the surface of the tetrahedron, which clearly has zero boundary. Of course
this element belongs also to B_2, since it is the boundary of the solid tetrahedron.

Hence dim B_2 = 1, dim Z_2 = 1 and dim H_2 = 0.

Turning our attention finally to H 3 , we note that there are no three-cycles so that 
dim Z 3 = 0 and dim H 3 = 0. 
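
The dimension counts above can be checked mechanically. The following is only a sketch under stated assumptions: the vertices are numbered 0 to 3, each cell is named by its sorted vertex list, and the boundary uses the usual alternating-sign convention, so the labels and orientations are illustrative and are not those of figure 14.2. The helper names `rank`, `boundary_matrix`, and `homology_dimensions` are hypothetical. The computation uses dim Z_k = dim C_k − rank ∂_k and dim B_k = rank ∂_{k+1}.

```python
from fractions import Fraction
from itertools import combinations


def rank(matrix):
    """Rank of a matrix (a list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def boundary_matrix(cells, faces):
    """Matrix of the boundary map: one column per cell, one row per face of one lower dimension."""
    row_of = {face: i for i, face in enumerate(faces)}
    matrix = [[0] * len(cells) for _ in faces]
    for j, cell in enumerate(cells):
        for pos in range(len(cell)):
            face = cell[:pos] + cell[pos + 1:]      # drop one vertex
            matrix[row_of[face]][j] = (-1) ** pos   # alternating-sign convention (an assumption)
    return matrix


def homology_dimensions(top_dim, vertices=(0, 1, 2, 3)):
    """dim H_k for k = 0..top_dim, using every simplex on `vertices` of dimension <= top_dim."""
    cells = [list(combinations(vertices, k + 1)) for k in range(top_dim + 1)]
    # rank_d[k] is the rank of the boundary map C_k -> C_{k-1}; it is 0 at both ends.
    rank_d = [0] + [rank(boundary_matrix(cells[k], cells[k - 1]))
                    for k in range(1, top_dim + 1)] + [0]
    # dim Z_k = dim C_k - rank_d[k] (rank-nullity); dim B_k = rank_d[k + 1].
    return [len(cells[k]) - rank_d[k] - rank_d[k + 1] for k in range(top_dim + 1)]


print(homology_dimensions(3))  # solid tetrahedron
print(homology_dimensions(2))  # surface of the tetrahedron only
```

Calling `homology_dimensions(2)` drops the three-cell, which leaves the surface of the tetrahedron as a two-complex.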

Euler’s Theorem 

In our study of electrical networks, we proved that 

dim H_0 − dim H_1 = dim C_0 − dim C_1.

The generalization of this result to n-complexes is Euler’s theorem:

dim H_0 − dim H_1 + dim H_2 − ··· ± dim H_n
= dim C_0 − dim C_1 + dim C_2 − ··· ± dim C_n.


The number given by either side of this equation is called the Euler characteristic of
the complex. Consider, for example, the complex of the tetrahedron. Since
dim H_0 = 1, while the spaces H_1, H_2 and H_3 are {0},

dim H_0 − dim H_1 + dim H_2 − dim H_3 = 1,

in agreement with

dim C_0 − dim C_1 + dim C_2 − dim C_3 = 4 − 6 + 4 − 1 = 1.

Suppose, instead of the solid tetrahedron, we consider the two-complex consisting
of the surface of the tetrahedron. The only change is that there is now no three-dimensional
element in the complex. As before, dim H_0 = 1 and dim H_1 = 0. It is still
true that dim Z_2 = 1, since the surface of the tetrahedron has zero boundary, but
now dim B_2 = 0, since there is no three-dimensional element to give rise to a two-dimensional
element which is a boundary. Therefore dim H_2 = dim Z_2 − dim B_2 = 1
in this case. For this complex, then, dim H_0 − dim H_1 + dim H_2 = 1 − 0 + 1 = 2
while dim C_0 − dim C_1 + dim C_2 = 4 − 6 + 4 = 2 also.

Let us now do a similar computation for the solid cube, shown in perspective in
figure 14.4(a), with its surface shown unfolded in figure 14.4(b). As before, dim H_0 = 1
because the complex is connected, and dim H_3 = 0 because there are no three-cycles.

To determine the dimension of Z_1, consider the maximal tree of seven branches
which is shown in figure 14.4 (thick lines). Each of the remaining five branches
determines a cycle. Hence dim Z_1 = 5, but dim B_1 = 5, and so dim H_1 = 0.

Turning our attention to H_2, we proceed as for the tetrahedron. Since dim C_2 = 6
and dim B_1 = 5, we conclude that dim Z_2 = 1. A basis for Z_2 is the entire surface
of the cube. But this surface is the boundary of the cube, so dim B_2 = 1 and
dim H_2 = dim Z_2 − dim B_2 = 0.

The homology spaces of the cube have turned out to be the same as those of the
tetrahedron. Again dim H_0 − dim H_1 + dim H_2 − dim H_3 = 1, and we can check
that

dim C_0 − dim C_1 + dim C_2 − dim C_3 = 8 − 12 + 6 − 1 = 1 also.


Invariance of homology 

From the point of view of topology there is a fundamental reason why the 
complex of the solid cube and the solid tetrahedron have the same homology spaces. 
Figure 14.5






The space consisting of the solid cube is the same as the space consisting of the solid 
tetrahedron, which in turn is the same as the space consisting of the solid ball - they 
can all be continuously deformed into one another. Similarly, the surface of the 
tetrahedron, cube, and sphere are all the same space. Figure 14.5(a) illustrates how 
the surface of the sphere may be broken up as a tetrahedron, while figure 14.5(b)
shows how it may be broken up as a cube. Of course the complex of the tetrahedron 
and the complex of the cube are different, but they represent different ways of 
breaking up the same space. 

There is a basic theorem of topology which says roughly the following. Suppose 
we can build up a space X by attaching various cells to one another along their 
boundary cells of one lower dimension. For example, we could build up the surface 
of the sphere by attaching four two-cells together along their edges as in 
figure 14.5(a), or by attaching six two-cells as in figure 14.5(b). Alternatively, we
could build up a three-dimensional cluster of polyhedra joined along their faces, like
a cluster of soap bubbles. Then, says the theorem, the homology spaces of the resulting
complex depend only on the space X and not upon how we built it up; if we can
decompose X into cells in some other way, to get some other complex, the spaces H_k
of the two complexes will be isomorphic for all k.

To continue with the example of the sphere, we can find a decomposition which is
simpler than either the tetrahedral or the cubical decomposition, as illustrated in
figure 14.7. The surface is broken into two triangles, corresponding to the north and
south hemispheres respectively, which are joined along the equator.



For this two-complex there is only one mesh, α + β + γ, and this mesh is clearly the
boundary of either hemispherical face. Hence dim Z_1 = dim B_1 = 1 and again
dim H_1 = 0. The sum of the two faces has zero boundary, since

∂(North) = α + β + γ

while

∂(South) = −α − β − γ.













So again dim H_2 = dim Z_2 = 1 as for the surface of the cube or tetrahedron. Again
we notice, for this two-complex, that

dim C_0 − dim C_1 + dim C_2 = 3 − 3 + 2 = 2 = dim H_0 − dim H_1 + dim H_2.


precisely. To do so we would have to define exactly what we mean by space, deform. 





afield to set up all the necessary concepts in a mathematically correct form.
Nevertheless, let us work with the theorem in the back of our minds, no matter how
imprecisely formulated, while we do some more computations in order to gain more
familiarity with the spaces H_k.

The torus 


t .i t• 




those which we have been considering, we consider a decomposition of the two-dimensional
torus; i.e., the surface of a donut. (Instead of cutting up a beach ball, we

four regions; figure 14.7(b) shows these regions flattened out, but still attached;
figure 14.7(c) shows the regions disassembled, but still labeled so that it is possible to
reassemble them. We might think of figure 14.7(c) as depicting a ‘torus kit’ and try to
discover in what fundamental ways it differs from the ‘sphere kit’ which would be
obtained by cutting apart figure 14.7(a).

Counting vertices, edges, and faces, we find that dim C_0 = 4, dim C_1 = 8, and
dim C_2 = 4, so that dim C_0 − dim C_1 + dim C_2 = 4 − 8 + 4 = 0. (For every
two-complex formed from the surface of the sphere, this quantity was always 2.)
We now consider the homology spaces of the complex of the torus. Since the 



only three edges. Take the time to convince yourself that any of the other five edges,
when added to this maximal tree, completes a cycle! For example, α + β is a cycle, as

three of the four faces, for example:


Sincei/fisthe 


∂(Yellow) = η − β + ε + δ,
∂(Green) = α − λ − γ − μ,
∂(Blue) = λ + μ + β − δ.


< 1 / u j , 113 CiCJUit-iiLO~3.Te vaiviK/^ waoow ini/uuiv . 

We must therefore find two cycles which are not boundaries, with the additional 

linear 


boundary. 

figure 14.7(a), we can readily find two such cycles: the cycle α + β which goes around
the torus in one direction, and the cycle ε + λ which goes around the torus in another
independent direction. A basis for H_1 therefore consists of the equivalence classes
of α + β and ε + λ. Of course, there are many other cycles which are not boundaries,
but they all fall into the equivalence class of some linear combination of α + β
and ε + λ. For example, the cycle γ + δ lies in the equivalence class of α + β because

γ + δ = α + β + ∂(Red) + ∂(Yellow).

The cycle α + ε + δ − μ may be expressed

(α + β) + (ε + λ) − ∂(Blue)

and so it lies in the same equivalence class as

α + β + ε + λ.


We are now in a position to understand the general significance of dim H_1: it is,

curves which do not bound any region. For the sphere, any closed curve is necessarily
the boundary of some region and so dim H_1 = 0. For a torus, on the other hand, there
are two distinct types of closed curves which do not bound any region, as illustrated by
figure 14.9, and so dim H_1 = 2. Notice that for the complex which we constructed for
the torus we had dim C_0 − dim C_1 + dim C_2 = 4 − 8 + 4 = dim H_0 − dim H_1 + dim H_2. Let us
examine how the spaces H_k change when we modify a surface by attaching a
‘handle’. By this we mean that we cut two small disks out of the surface and attach a 
(curved) cylinder, open at the top and the bottom, to the surface by attaching the top 
of the cylinder to the boundary of one of the disks and the bottom of the cylinder to 
the boundary of the other disk. Let us choose a decomposition of our original 



and we 


think of the cylinder as being made up of four sides. Thus in attaching the cylinder we
add no new nodes, we add four new branches (the edges of the square cylinder),
remove two former elements of S_2 (the interiors of the disks) and add four new
elements of S_2 (the four sides of the cylinder). For the new surface we still have that
H_0 and H_2 are one-dimensional. In computing Z_1 for the new complex we can
clearly choose a maximal tree which does not contain any of the four new branches
(since each of them clearly lies on a mesh - recall our procedure for constructing
maximal trees in our study of electrical networks). Thus we can construct a maximal
tree consisting entirely of branches of the complex of the old surface. Each branch of
the cylinder then gives a new independent mesh, and we have increased the
dimension of Z_1 by 4. On the other hand, we have increased the dimension of C_2,
and hence of B_1, by 2, and so we have increased the dimension of H_1 by 2. Thus we
have shown that, each time we add a handle, we increase the dimension of H_1 by 2. A
sphere with k handles attached has dim H_0 = 1, dim H_1 = 2k and dim H_2 = 1. For
example, we can think of the torus as a sphere with one handle attached, giving us
another proof of our preceding results for the torus.
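
The bookkeeping in this argument is easy to replay numerically. The sketch below assumes we start from the surface of the cube (8 nodes, 12 branches, 6 faces), chosen only because its faces are squares; the function names are hypothetical. Each handle adds four branches and a net two faces, and dim H_1 is then recovered from Euler's theorem using dim H_0 = dim H_2 = 1.

```python
def attach_handles(nodes, branches, faces, k):
    """Cell counts after attaching k square-cylinder handles, following the recipe in the text."""
    for _ in range(k):
        branches += 4     # the four edges running along the square cylinder
        faces += 4 - 2    # two disks removed, four cylinder sides added
    return nodes, branches, faces


def dim_h1(nodes, branches, faces, h0=1, h2=1):
    """Solve Euler's theorem h0 - h1 + h2 = nodes - branches + faces for h1."""
    return h0 + h2 - (nodes - branches + faces)


for k in range(4):
    counts = attach_handles(8, 12, 6, k)
    print(k, counts, dim_h1(*counts))
```

Only the cell counts enter, so any other decomposition of the sphere with suitable faces would serve equally well as the starting point.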













example, a complex in the plane in the form of a square together with its interior. For
such a complex it is clear that dim H_0 = 1 while dim H_1 = 0 and dim H_2 = 0.
Suppose we cut out a hole from the interior of the square. If we had a decomposition
of our original square for which the hole we removed was one of the elements of S_2,
we see that the process of removing the hole does not change Z_1, but does decrease
the dimension of C_2, and hence of B_1, by 1; and hence increases the dimension of H_1
by 1. Thus, if we have a region in the plane with k holes cut out, dim H_1 = k, while
dim H_0 = 1 and dim H_2 = 0. A similar argument in three dimensions shows that if we
have a region in three-dimensional space with k (three-dimensional) holes cut out
then dim H_0 = 1, dim H_1 = 0, dim H_2 = k and dim H_3 = 0.


The Klein bottle 

You may have become aware, in your study of electrical networks, that there are
networks, like the one shown in figure 14.14, which cannot be assembled in the plane 



without having wires cross. Such non-planar networks can, however, always be

called Klein bottle, a surface which gives rise to the complex of figure 14.15(a). In













δ and λ together with their correct orientations. For the complex of the Klein bottle,
the homology spaces are somewhat different than for the torus. Of course, dim H_0 =
1 because the complex is connected. However, dim H_2 = 0, because there is no
element of C_2 which has zero boundary. (The boundary of the entire flattened out
complex of figure 14.15(a) is 2δ + 2λ, and you can readily convince yourself that
changing the sign of one or more two-cells will still not yield an element of C_2 whose
boundary is zero.) As for the complex of figure 14.7, dim Z_1 = 5, but, for the Klein
bottle complex, dim B_1 = 4 and dim H_1 = 5 − 4 = 1. This means that there is only
one independent cycle which is not a boundary. The cycle α + β is not a boundary in
















It is reassuring to notice that, for the Klein bottle case, 


dim H 0 - dim H 1 + dim H 2 = 1 - 1 + 0 = 0 also. 
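
As a cross-check, the torus and the Klein bottle can be compared on a much smaller decomposition than the four-region one used here. This is a sketch under an assumption not made in the text: each surface is built from a single square whose sides are glued in pairs, giving one node, two branches (here called `a` and `b`) and one face. The face is described by the word read around its boundary, and `betti_from_word` is a hypothetical helper. With one node every branch is a cycle, so Z_1 = C_1, and the boundary map on C_2 has rank 0 or 1.

```python
from collections import Counter


def betti_from_word(word):
    """(dim H_0, dim H_1, dim H_2) for a one-node, one-face complex.

    `word` lists the (branch, sign) pairs met while walking around the single face.
    """
    coefficients = Counter()
    for branch, sign in word:
        coefficients[branch] += sign          # boundary of the face, as a one-chain
    branches = {branch for branch, _ in word}
    rank_d2 = 1 if any(c != 0 for c in coefficients.values()) else 0
    h0 = 1                                    # the complex is connected
    h1 = len(branches) - rank_d2              # dim Z_1 - dim B_1, with Z_1 = C_1
    h2 = 1 - rank_d2                          # dim Z_2; there are no three-cells
    return h0, h1, h2


torus = [("a", 1), ("b", 1), ("a", -1), ("b", -1)]
klein_bottle = [("a", 1), ("b", 1), ("a", -1), ("b", 1)]

print(betti_from_word(torus))
print(betti_from_word(klein_bottle))
```

For the Klein bottle word the boundary of the face is 2b rather than zero, which mirrors the way the boundary of the whole flattened complex in the text fails to vanish.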


Proof of Euler’s theorem 
We now prove the general result 

dim H_0 − dim H_1 + dim H_2 − ··· ± dim H_n = dim C_0 − dim C_1 + dim C_2 − ··· ± dim C_n

which we have checked in many special cases. The ingredients of the proof are 
simple. 

(1) Since H_k = Z_k/B_k, dim H_k = dim Z_k − dim B_k.

(2) As a consequence of the rank-nullity theorem applied to ∂: C_{k+1} → C_k,
dim C_{k+1} = dim Z_{k+1} + dim B_k.

(3) For an n-complex, dim B_n = 0 because there are no (n + 1)-dimensional
objects.

(4) dim Z_0 = dim C_0 because the boundary of any zero-dimensional cell is zero by
definition, there being no objects of negative dimension in the complex.

Combining (1) and (2) we find

dim H_k = dim Z_k − dim C_{k+1} + dim Z_{k+1}

for 1 ≤ k < n. For k = 0 we have, by virtue of (4),

dim H_0 = dim C_0 − dim C_1 + dim Z_1

while for k = n we have, by virtue of (3),

dim H_n = dim Z_n.

Thus

dim H_0 − dim H_1 + dim H_2 − ··· ± dim H_n

= (dim C_0 − dim C_1 + dim Z_1) − (dim Z_1 − dim C_2 + dim Z_2)

+ (dim Z_2 − dim C_3 + dim Z_3) − ··· ± dim Z_n

= dim C_0 − dim C_1 + dim C_2 − ··· ± dim C_n.

As an application of this result, we consider a convex polyhedron, which we
regard as a complex formed by cutting up the surface of the sphere into cells. If we
believe the theorem quoted earlier to the effect that the dimensions of the spaces
H_0, H_1, and H_2 are the same for any complex formed by decomposing a given
surface, we can deduce that dim H_0 − dim H_1 + dim H_2 = 1 − 0 + 1 = 2 as we found
earlier for the cube and tetrahedron. It follows that, for any convex polyhedron,

dim C_0 − dim C_1 + dim C_2 = 2
or

number of vertices − number of edges + number of faces = 2.
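
A quick numerical check of this formula, using the standard vertex, edge, and face counts of the five regular convex polyhedra (the dictionary and function names are only illustrative):

```python
# (vertices, edges, faces) for the five regular convex polyhedra
POLYHEDRA = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron": (12, 30, 20),
}


def euler_characteristic(vertices, edges, faces):
    """dim C_0 - dim C_1 + dim C_2 for the surface of a polyhedron."""
    return vertices - edges + faces


for name, counts in POLYHEDRA.items():
    print(name, euler_characteristic(*counts))
```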



As we have said above, we have not given a precise definition of the notion of the
underlying space of a complex. But here is how we would like you to think about the
situation. Each k-cell, that is, each element of S_k, should be thought of as a bounded
convex polyhedron in ℝ^k with a definite orientation. (Remember that we can choose
one of two orientations on ℝ^k, and hence on any open subset, by choosing an (ordered)
basis.) Now the boundary of this convex polyhedron will be a union of finitely many
convex polyhedra of dimension k − 1, each lying in some affine subspace of dimension
k − 1. By a choice of origin and basis we may identify each such subspace with ℝ^{k−1}.
We assume that, after we have made such identifications, each of the corresponding
(k − 1)-dimensional polyhedra corresponds to a (k − 1)-cell in our complex, that is,
to an element of S_{k−1}. But the elements of S_{k−1} come with orientations. So we must
see how the orientation of a k-cell relates to the orientation of the (k − 1)-cells on its
boundary. This information, of course, is coded into the ∂ operator.

Let v_1, ..., v_k be a basis of ℝ^k chosen so that v_2, ..., v_k is tangent to a specified
(k − 1)-dimensional face of our k-cell C, and so that v_1 points out of the k-cell. Suppose that the
orientation of ℝ^k determined by v_1, ..., v_k coincides with the orientation of C. Then
v_2, ..., v_k determines an orientation of the particular (k − 1)-dimensional face
corresponding to, say, F ∈ S_{k−1}. Now this orientation may or may not coincide with
the orientation of F. If it does, we will assign a + sign to F, otherwise, a − sign. Thus

∂C = Σ ±F

where the sum is over those F corresponding to the boundary (k − 1)-faces of C. If the
basis v_1, ..., v_k corresponded to the opposite orientation of C, the signs would all be
reversed, of course.

If k ≥ 2 we could replace v_k by −v_k if necessary to get a ‘good’ basis. For k = 1 the
space ℝ^{k−1} is just the zero vector space. An ‘orientation’ here is just a choice of + or
− sign. Then our rule says to pick the + sign if the vector giving the orientation
points out and the − sign otherwise.

Of course, each (k − 1)-cell might occur in the boundary of several k-cells. But since
the k-cells form a basis of C_k, defining ∂ as above for each k-cell determines the linear map
∂: C_k → C_{k−1}.

To check that ∂∘∂ = 0 it is sufficient to check that ∂(∂C) = 0 for each k-cell. Now
every (k − 2)-cell E that can occur in the expression for ∂(∂C) occurs because it lies on the
boundary of exactly two (k − 1)-dimensional faces, say F_1 and F_2, of C. You should
now check your understanding of the notion of orientation and our definition of ∂ to
see that ∂(∂C) = 0.

Having realized each element of S_k as a convex polyhedron, we can define a smooth
map of the complex as a whole into ℝ^N. This will be a rule f that assigns to each point of
each k-cell a point of ℝ^N. Whenever we have made one of our identifications of a k-cell
C as a convex polyhedron in ℝ^k, the map f, restricted to C, can be thought of as a
map from a convex polyhedron in ℝ^k to ℝ^N. We want this map to be smooth. In fact, we
want it to be the restriction of a map g defined in a slightly larger region (so as to include
C and all its boundary in the interior) where g is smooth. We will study this in
more detail in Chapter 15.










14.2. Dual spaces and cohomology 

We turn our attention now to the collection of dual spaces to the spaces C_k. We
denote these spaces by C^k. Thus, an element of C^k, which we call a k-cochain, is
a linear function of the elements of C_k, which we call k-chains. If σ is a k-cochain
and c is a k-chain, we shall denote the value of the cochain σ on the chain c by ∫_c σ,
just as we have already done in the cases k = 0 and k = 1 while considering electrical
networks.

We can now consider the transpose, d, of the boundary operator ∂: C_{k+1} → C_k,
which we shall call the coboundary operator. Thus d: C^k → C^{k+1} is defined by the
formula

∫_c dσ = ∫_{∂c} σ, where σ ∈ C^k and c ∈ C_{k+1}.

It follows directly from the fact that the boundary of a boundary is zero that
d∘d is zero. (Of course, this is the composition of d: C^k → C^{k+1} with d: C^{k+1} → C^{k+2}.)
In the notation just introduced,

∫_c d(dσ) = ∫_{∂c} dσ = ∫_{∂(∂c)} σ = 0, since ∂(∂c) = 0.


In order to gain a feeling for the meaning of a cochain and the significance of
the coboundary operator d we consider the two-complex of figure 14.16. An element
of C^0, a zero-cochain Φ, is defined completely by specifying its value at each node,
for example

Φ^A = 1, Φ^B = 3, Φ^C = 2, Φ^D = 5.



The potential functions which figured prominently in electrical network theory 
are examples of zero-cochains. 

The action of the coboundary operator d on a zero-cochain yields a one-cochain
dΦ. To compute the value of dΦ on a branch, we use the fact that d is the adjoint
of ∂. For example, since ∂α = B − A, ∫_α dΦ = ∫_{∂α} Φ = Φ^B − Φ^A. As is already familiar,
the operator d: C^0 → C^1 converts a potential function, which is a zero-cochain or
linear function defined on nodes, into a voltage rise function, a one-cochain defined
as a linear 




We can now proceed to investigate the significance of d: C^1 → C^2. An example
of a one-cochain might be a linear function W with values such as

W^α = 1, W^β = 2, W^γ = 1, W^δ = 2, W^ε = −3, W^η = 2.

The battery voltages in an electric network defined such a one-cochain.

Now the effect of d on a one-cochain W is to yield a two-cochain dW which is
a linear function on two-chains. We can compute the value of dW on the two-cell
Red, for example, by using the definition

∫_Red dW = ∫_{∂(Red)} W.

Since ∂(Red) = α + β − γ, we have

∫_Red dW = W^α + W^β − W^γ = 1 + 2 − 1 = 2.

Similarly

∫_Blue dW = W^δ − W^ε + W^η = 2 + 3 + 2 = 7.


Now that the values of dW on a basis for C 2 are known, its values on any element 
of C 2 may be found by linearity. 

As a final example, consider the three-complex of the tetrahedron shown in 
figure 14.17. Let R, Y, G, and B denote the four faces of the tetrahedron, all oriented 



counterclockwise as viewed from the exterior. Then a two-cochain T is specified
by its values on the faces, e.g., T^R = 1, T^Y = 2, T^G = 4, T^B = −1. The effect of
d: C^2 → C^3 on the two-cochain T is to yield a three-cochain dT defined on the
elements of C_3. We need only to compute the value of dT on the solid tetrahedron.
If the tetrahedron is given a right-handed orientation, so that

∂(Tetrahedron) = R + Y + B + G,

then

∫_Tetrahedron dT = T^R + T^Y + T^G + T^B = 1 + 2 + 4 − 1 = 6.
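
The rule "evaluate the cochain on the boundary" is easy to mechanize. The sketch below stores chains and cochains as dictionaries keyed by cell name; the spelled-out names (`alpha`, `Red`, and so on) and the helpers `evaluate` and `coboundary` are illustrative choices, not notation from the text. It uses only the boundaries stated explicitly above, ∂(Red) = α + β − γ and ∂(Tetrahedron) = R + Y + B + G.

```python
def evaluate(cochain, chain):
    """Value of a cochain on a chain; the chain is a dict {cell: coefficient}."""
    return sum(coeff * cochain[cell] for cell, coeff in chain.items())


def coboundary(cochain, boundaries):
    """Return d(cochain) as a dict on the cells whose boundaries are supplied."""
    return {cell: evaluate(cochain, boundary) for cell, boundary in boundaries.items()}


# One-cochain W on the branches entering the boundary of Red.
W = {"alpha": 1, "beta": 2, "gamma": 1}
dW = coboundary(W, {"Red": {"alpha": 1, "beta": 1, "gamma": -1}})

# Two-cochain T on the faces of the tetrahedron.
T = {"R": 1, "Y": 2, "G": 4, "B": -1}
dT = coboundary(T, {"Tetrahedron": {"R": 1, "Y": 1, "G": 1, "B": 1}})

print(dW["Red"], dT["Tetrahedron"])
```

Because `evaluate` is linear in the chain, the value of dW or dT on any combination of cells follows from these basis values.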




You may already have noticed the similarity between the coboundary operator and
certain operations of significance in physics. For example, the two-cochain T might
have specified the flux of
the electric field through the various faces of a tetrahedron, in which case the
three-cochain dT would be, by virtue of Gauss’s Law, proportional to the total
electric charge within the solid tetrahedron. In the earlier example, if the one-cochain
W had represented voltages induced by a changing magnetic field, then
the two-cochain dW would be, by virtue of Faraday's Law, proportional to the
rate of change of magnetic flux through each of the various two-cells. Indeed, the
coboundary operator d is going to evolve into a differential operator that will
permit a concise statement of all of Maxwell’s equations.

We can now consider the image and kernel of the coboundary operator in order
to construct subspaces of the spaces C^k. To be specific, the image of d: C^{k−1} → C^k
is called B^k, the subspace of coboundaries, while the kernel of d: C^k → C^{k+1} is called
Z^k, the subspace of cocycles. Since d∘d = 0, any coboundary is necessarily a cocycle,
and B^k is therefore a subspace of Z^k. We form the quotient space H^k = Z^k/B^k called
the kth cohomology space of the complex.

Many aspects of the construction of these spaces have appeared already in our 
study of electrical networks. In the case k = 0, for example, the space Z° (zero- 

each connected component of a network. The space B^0 was empty, since there is
no way for a zero-cochain to lie in the image of d. As a result, the quotient space
H^0 = Z^0/B^0 = Z^0. (We used the notation H^0 exclusively in discussing this space
earlier.)


For a one-complex, the space Z^1 is the entire space C^1, for the simple reason
that, with no two-cochains available, every one-cochain must lie in the kernel of
d. The space B^1 of coboundaries contains those voltage distributions which are
derivable from a potential. The quotient space H^1 = Z^1/B^1 is, in this case, also
the space C^1/B^1. You will recall that the space H^1 is dual to the space H_1 (for a
one-complex, the same as Z_1).

The simplest example in which Z^k, B^k, and H^k are all non-trivial subspaces is
the case k = 1 for a two-complex. Consider the complex shown in figure 14.18.


a 








Here, since there are six branches, the space C^1 is six-dimensional. To determine


Consider first the image of d: C^1 → C^2, the space B^2. Since C^2 is two-dimensional,
B^2 cannot have dimension greater than 2. To show that its dimension is 2, it is




V which assigns 1 to branch α and 0 to all other branches has the property that

dV(Red) = 1, dV(Blue) = 0.

Similarly, the one-cochain W which assigns 1 to branch β and 0 to all other
branches satisfies

dW(Red) = 0, dW(Blue) = 1.

Therefore B^2, the image of d, spanned by dV and dW, has dimension 2, and by
the rank-nullity theorem the kernel of d, the space Z^1, satisfies

dim Z^1 = dim C^1 − dim B^2 = 6 − 2 = 4.

Turning our attention now to B^1, we consider d: C^0 → C^1. The kernel of this
operator, the space Z^0 (which is the same as H^0) has dimension 1, because the
complex is connected. Therefore

dim B^1 = dim C^0 − dim Z^0 = 4 − 1 = 3.

That is, there are three independent one-cochains which are derivable from
potentials.

We see now that the space Z^1 of one-cocycles has dimension 4, while the space
B^1 of one-coboundaries has dimension 3. Therefore dim H^1 = dim Z^1 − dim B^1 = 1,
so there are one-cochains which lie in the kernel

of d even though they cannot be derived from a potential. The equivalence class 
of one such cocycle will provide a basis for the one-dimensional quotient space 
Z 1 . Let us choose, for example, the cocycle V for which V a = j,V l) = j, V y = 

V 6 = V s = 0, V = 0. This is clearly not derivable from a potential, because the 
sum of the voltage drops around a cycle such as a + j? or y + d is not zero. On 
the other hand, V lies in the kernel of d:C 1 -»’C 2 , since dV(Red)= V a —V d = 




an mie 




is intimately related to the presence of a hole in the complex. If the disk bounded 


KlftfSTTSETiR “ nsVaSTSRFSTRRi PSVfSEll RiSvSI rSWi* a tTSTIWa TTSmSui pVSRtSvm rSCmv 


be the coboundaries; the cochain V considered above would have dV(Yellow) = 1, 
for example, and would not be a cocycle. In this way the analysis of the cohomology 
spaces of the complex reveals the presence of the hole just as the analysis of the 
homology spaces would have. 

In this example, the space H^1 is readily seen to be dual to H_1 in the sense that
the basis element V assigns 1 to any cycle which encircles the hole once in a
counterclockwise direction, and the cycles which encircle the hole once
counterclockwise constitute an equivalence class which can serve as a basis for
H_1. Furthermore, the specific choice of the element V as a representative of its
equivalence class did not matter. Suppose we had used W = V + dΦ instead. Then
for any cycle M

∫_M W = ∫_M V + ∫_M dΦ = ∫_M V + ∫_{∂M} Φ = ∫_M V

since ∂M = 0 if M is a cycle. In this sense, the equivalence class V̄ (a basis for H^1)
determines a linear function on the equivalence class Γ̄ (a basis for H_1). To evaluate
the function, we evaluate ∫_Γ V, using any member of either equivalence class.
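
The independence of the representative can be seen numerically on a small example. Since the labels and orientations of figure 14.18 are not reproduced here, the sketch uses a hypothetical annulus with the same cell counts (four nodes, six branches, two faces): an inner loop `alpha`, `beta`, an outer loop `gamma`, `delta`, and two radial branches `epsilon`, `zeta`. All names, orientations, and the particular values of V and of the potential are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical annulus. Each branch is listed as (tail node, head node).
BRANCHES = {
    "alpha": ("A", "B"), "beta": ("B", "A"),        # inner loop around the hole
    "gamma": ("C", "D"), "delta": ("D", "C"),       # outer loop
    "epsilon": ("A", "C"), "zeta": ("B", "D"),      # radial branches
}
# Boundaries of the two faces as one-chains {branch: coefficient}.
FACES = {
    "Red": {"alpha": 1, "zeta": 1, "gamma": -1, "epsilon": -1},
    "Blue": {"beta": 1, "epsilon": 1, "delta": -1, "zeta": -1},
}


def evaluate(cochain, chain):
    """Value of a cochain on a chain given as {cell: coefficient}."""
    return sum(coeff * cochain[cell] for cell, coeff in chain.items())


def d_of_potential(potential):
    """Coboundary of a zero-cochain: value at the head minus value at the tail."""
    return {b: potential[head] - potential[tail] for b, (tail, head) in BRANCHES.items()}


half = Fraction(1, 2)
V = {"alpha": half, "beta": half, "gamma": half, "delta": half, "epsilon": 0, "zeta": 0}
phi = {"A": 1, "B": 3, "C": 2, "D": 5}
d_phi = d_of_potential(phi)
W = {b: V[b] + d_phi[b] for b in BRANCHES}          # another representative, V + d(phi)

M = {"alpha": 1, "beta": 1}                          # a cycle encircling the hole once

print([evaluate(V, FACES[f]) for f in FACES])        # V vanishes on both face boundaries
print(evaluate(V, M), evaluate(W, M))                # same value for both representatives
```

The same comparison with the outer loop `gamma` + `delta` in place of M gives the same value, since under these assumptions the two loops differ by the boundary of Red + Blue.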

Let us now proceed to prove that H^k can be identified in general with the dual
space of H_k. We shall first establish that dim H^k = dim H_k, then show that any
element of H^k defines a linear function on the space H_k.

Everything will follow from the now-familiar result: the kernel of the adjoint
annihilates the image; the image of the adjoint annihilates the kernel. Consider
the following diagram:

C^{k−1} → C^k → C^{k+1}   (each arrow is d)

C_{k−1} ← C_k ← C_{k+1}   (each arrow is ∂)

We look first at ∂: C_{k+1} → C_k and its adjoint d: C^k → C^{k+1}. The kernel of the adjoint
d is the space Z^k of cocycles; it annihilates the image of ∂, which is the space B_k of
boundaries. We turn our attention next to ∂: C_k → C_{k−1} and its adjoint d: C^{k−1} → C^k.
The image of the adjoint is the space B^k of coboundaries; it annihilates the kernel of ∂,
which is the space Z_k of cycles.

But we know that, for any subspace of C_k, the dimension of the subspace plus
the dimension of its annihilator space must equal the dimension of C_k. Hence

dim B_k + dim Z^k = dim C_k = dim Z_k + dim B^k.

It follows that

dim Z^k − dim B^k = dim Z_k − dim B_k,

i.e., that

dim H^k = dim H_k

since

H^k = Z^k/B^k and H_k = Z_k/B_k.

A typical element of H^k is the equivalence class V̄ consisting of cocycles of the
form V + dΦ. If V̄ is different from the zero element, then V is not a coboundary.
Similarly, a typical element of H_k is the equivalence class Γ̄ consisting of cycles of
the form Γ + ∂τ. If Γ̄ is different from the zero element, then Γ is not a boundary.
We define the value of V̄ acting on Γ̄, ∫_Γ̄ V̄, by letting any element of the
class V̄ act on any element of the class Γ̄. Thus

∫_{Γ+∂τ} (V + dΦ) = ∫_Γ V + ∫_Γ dΦ + ∫_{∂τ} V + ∫_{∂τ} dΦ

= ∫_Γ V + ∫_{∂Γ} Φ + ∫_τ dV + ∫_τ d(dΦ).

But ∂Γ = 0 because Γ is a cycle, dV = 0 because V is a cocycle, and of course
d(dΦ) = 0. Hence

∫_{Γ+∂τ} (V + dΦ) = ∫_Γ V.

In other words, it makes no difference which element of the equivalence classes
we use: the elements of H^k define linear
functions on H_k. This completes the identification of H^k with the dual of H_k.

It was mentioned previously that the homology spaces H_k associated with a
complex depend only upon the underlying space and not upon how it was cut up
into cells. It follows from the duality just established that the same is true of the
cohomology spaces H^k. Because the cochains which give rise to the H^k are closely
related to quantities of physical significance, these spaces will eventually reveal
the impact of topology on electromagnetic theory.

Before leaving the subject of complexes, it is worthwhile to summarize everything 
which we have studied in a single diagram, figure 14.19. While drawn for a three- 

Z^0 annihilates B_0        Z^1 annihilates B_1        Z^2 annihilates B_2

B^0 is empty               B^1 annihilates Z_1        B^2 annihilates Z_2

H^0 = Z^0, dual to H_0 = C_0/B_0
H^1 = Z^1/B^1, dual to H_1 = Z_1/B_1
H^2 = Z^2/B^2, dual to H_2 = Z_2/B_2
Higher dimensional complexes

Given diagrams that represent the cells of a two-complex, you should be able to
determine whether the complex corresponds to a sphere, a torus, or something else.


Cohomology and the d operator 


I rmiETaWIMa/al ■ al 


IffailB'J ■ ral lirtB aTlXIml 


I ildl IUI ariJ tL*m fai*&*fe iTtBri«f 


You should understand how to prove and apply the duality of H^k and H_k.



Exercises 



14.1. The five two-complexes in figure 14.20 represent (in some order):

(a) the curved surface of a cylinder;

(b) a Möbius strip (a cylinder with a twist in it);

(c) a sphere;

(d) a torus;

(e) a Klein bottle.

Identify them, and calculate dim H_1 and dim H_2 in each case. Find
bases for Z_1, B_1, and H_1 in each case, and check that dim C_0 − dim C_1 +





Figure 14.20 


14.3. An inflatable rubber swimming-pool toy has been divided into three 


tummgiiifaaMiiMataTiitaEwaimvKiivftjiiBiirjiMiiMiM 


(a) Determine ∂(Red) and ∂(Blue).

(b) Determine the dimension of Z_2, H_2, and H_0.

(c) Find bases for Z_1, B_1, and H_1.




















A 



Figure 14.21



(d) Figure 14.22 shows the third region:

After the complete toy has been assembled, what are the dimensions
of C_0, C_1, C_2, H_1, H_2? Is the assembled toy a beach ball (dim H_0 = 1,
dim H_1 = 0, dim H_2 = 1), a life ring (dim H_0 = 1, dim H_1 = 2,
dim H_2 = 1), or something else?



14.4. Three employees of an automobile-tire manufacturing company have
divided the surface of an inner tube into three oriented regions, and each
employee has attached his name to one region. The resulting two-complex
is shown in figure 14.23.

Figure 14.23


(a) What is the dimension of H_0?

(b) Determine the dimension of H_2, and write down a basis for this space.
(The two-cells are named Al, Ben, and Clem; just form an appropriate
linear combination but be careful with orientation.) What does your

(c) Determine dim C_0, dim C_1, and dim C_2, then calculate dim H_1.

(d) Write down a basis for Z_1, a basis for B_1 and a basis for the quotient
space H_1 = Z_1/B_1.
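(e) Here are three one-cochains, elements of $C^1$:

$\sigma(\alpha) = 1$        $\tau(\alpha) = 1$        $\omega(\alpha) = 1$
$\sigma(\beta) = -1$        $\tau(\beta) = -1$        $\omega(\beta) = -1$
$\sigma(\gamma) = 2$        $\tau(\gamma) = 2$        $\omega(\gamma) = 2$
$\sigma(\delta) = -2$       $\tau(\delta) = 2$        $\omega(\delta) = -2$
$\sigma(\varepsilon) = -2$  $\tau(\varepsilon) = -2$  $\omega(\varepsilon) = 2$
$\sigma(\lambda) = 1$       $\tau(\lambda) = -1$      $\omega(\lambda) = -1$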


Identify the cochain which is not a cocycle and determine the two-cochain
which is its coboundary (e.g., $d\Omega$(Al) = 1, $d\Omega$(Ben) = 3,
$d\Omega$(Clem) = 2).

(f) Identify the cochain which is a coboundary and write down a zero-cochain
$f$ whose coboundary it is.

(g) Show that the remaining cochain is a cocycle but not a coboundary.
Evaluate it on your two basis elements for $H_1$.

(h) Construct a second cocycle which is not a coboundary and which will
serve as a second basis element for $H^1$.

(i) Write down a basis for $B^1$ and a basis for $Z^1$.

14.5. G.J. Caesar, special envoy of Galactic Rome, was sent to survey the space 



in partes tres’, included the three maps shown in figure 14.24, which show 
the portions of the space station controlled by the Aquitani (Aqu), Belgae 
(Bel) and Celts (Cel) respectively. These warring tribes have been unable to 
agree on a consistent orientation for the entire surface of the station. 









Answer the following questions about the New Gaul complex: 

(a) Calculate the boundary of each region: $\partial$(Aqu), etc.

(b) Find a maximal tree. Using this tree, calculate $\dim Z_1$, $\dim B_1$, and
$\dim H_1$, and construct a basis for $H_1$. Express $\alpha - \gamma + \delta - \eta$
and $+\,\gamma$ in terms of this basis.

(c) If to is a one-coboundary (element of B ' ) satisfying of = 2, a/ = 1 , then 
what are the values co y and co”? 

(d) If $\tau$ is a one-cochain whose value on each branch is 1, then what are the
values $d\tau$(Aqu) and $d\tau$(Bel)?

(e) Calculate $\dim Z^1$. If $\sigma$ is a one-cocycle (element of $Z^1$) with $\sigma^\alpha = 3$,
$\sigma^\beta = 1$, $\sigma^\delta = 2$, $\sigma^\eta = 4$, what are the values $\sigma^\gamma$ and $\sigma^\varepsilon$?

(f) Construct a basis of $H^1$ which is dual to the basis for $H_1$ from part (b).
Express the equivalence class of $\sigma$ in terms of this basis.

(g) Could the surface of New Gaul be a sphere? A cylinder? A torus?

14.6. A colony on a small planet is divided into two provinces, shown as Red
and Green in figure 14.25.








(a) Calculate $\partial$(Red) and $\partial$(Green) and use the result to determine the
dimension of $H_2$ for this two-complex. Interpret your result to
determine whether or not the colony occupies the entire surface of the
planet.

(b) A maximal tree for the complex consists of branches $\alpha$, and $\delta$. Find
bases for the spaces $Z_1$, $B_1$ and $H_1$. Interpret your basis element for $H_1$
geographically.

(c) Define the space $B^1$. Of what space is it the annihilator space? Prove it.
For the remainder of the problem, consider a one-cochain $W$ for which
W a = X -1 T W d = 1. 

(d) Find values of $W^\gamma$, $W^\varepsilon$, $W^\eta$ so that $W$ is a coboundary.

(e) Suppose that $W^\gamma = 3$, $W^\varepsilon = 2$, $W^\eta = 1$. Calculate $dW$.

(f) Find values of $W^\gamma$, $W^\varepsilon$, $W^\eta$ such that $W$ is a cocycle but not a
coboundary. Evaluate this cochain on your basis for $H_1$.



Figure 14.26 


14.7. (a) Find the dimension of the spaces $C_1$, $C_2$, $Z_1$, $B_1$, and $H_1$ for the complex
illustrated in figure 14.26(a).

(b) Find the dimension of $H_0$ and $H_2$ and verify

$\dim C_0 - \dim C_1 + \dim C_2 = \dim H_0 - \dim H_1 + \dim H_2$. (14.1)

(c) Let a new branch $\sigma$ and a new node $C$ be added as shown in figure
14.27(b). Write down the matrices $\partial: C_1 \to C_0$, $\partial: C_2 \to C_1$ and compute the
matrix $\partial \circ \partial: C_2 \to C_0$ explicitly.

(d) Now a cut is made along branch $\sigma$, separating it into $\sigma_1$ (on the Green side)
and $\sigma_2$ (on the Blue side). Write down $\partial: C_1 \to C_0$. Find the dimension of
$H_0$ by constructing a basis. What does this say about the network?
Construct a basis of $H_1$ and verify (14.1).

(e) Let $\tau$ be a one-cochain with the value 1 on all branches between distinct
points, zero otherwise. What are $d\tau$(Green) and $d\tau$(Blue)?

14.8. The two pieces shown in figure 14.27 define a two-complex.

(a) Calculate $\partial$(Front) and $\partial$(Back), and use the result to determine
$\dim Z_2$.






(b) Find a maximal tree for the complex, and use it to determine $\dim Z_1$,
$\dim B_1$, and $\dim H_1$.

(c) Find a basis for $H_1$ and express $\gamma$ and $\alpha + \delta + \beta$ in terms of the basis.

(d) Let co be a one-coboundary satisfying a/ = 2. Calculate of. 
Chapters 15-18 develop the exterior differential calculus as a
continuous version of the discrete theory of complexes. In
Chapter 15 the basic facts of the exterior calculus are
presented: exterior algebra, $k$-forms, pullback, exterior
derivative and Stokes’ theorem.


Introduction 

In our previous consideration of complexes we have avoided making use of the
fact that a complex might be situated in $\mathbb{R}^n$. We ignored the shapes of the wires in
electrical networks, and even the question of whether a given network could be
constructed in the plane. Now we turn our attention to the special case of a
complex situated in $\mathbb{R}^n$ in order to see how cochains can arise naturally from
physical or geometrical considerations.

Think first of a one-complex, such as figure 15.1, situated in $\mathbb{R}^3$. The nodes
are points in $\mathbb{R}^3$; the branches are differentiable curves in $\mathbb{R}^3$. Since a zero-cochain is
determined once we know its value on each node of the complex, we can regard 











operator which acts on a zero-cochain $\Phi$ to produce a one-cochain $d\Phi$, and as a
differential operator which acts on a function $\phi$ to produce a differential form $d\phi$.
It is now clear that this use of $d$ is consistent in the sense that if the function $\phi$
gives rise to the zero-cochain $\Phi$, then the one-form $d\phi$ gives rise to the one-cochain $d\Phi$.














three-forms, which will give rise, by suitable processes of integration, to two- 



in a manner consistent with the action of d on the corresponding cochains. 

this program might work out. An incipient two-cochain will have to assign a number
to each oriented region on the plane; the value assigned to each cell $Q$ in a complex
will then define a two-cochain. Now we can proceed from a one-form (incipient
one-cochain) $\omega$ to a two-cochain $T$ in two different ways, as suggested by figure 15.2:
consider, for example, the path $\alpha + \beta + \gamma$ which is the boundary of the cell $Q$. One
approach is to form a one-cochain $W$ from $\omega$ by evaluating its path integral along
each branch of the complex, then apply the operator $d: C^1 \to C^2$. This yields

$$T_Q = \int_Q dW = \int_{\partial Q} W = W^\alpha + W^\beta + W^\gamma = \int_\alpha \omega + \int_\beta \omega + \int_\gamma \omega.$$

The other approach is to apply a differential operator $d$ to $\omega$ to obtain a
two-form $\tau$, which could then be integrated over the region $Q$ to yield the value
of the two-cochain $T$ on the cell $Q$.

In this case Green’s theorem in the plane tells us what form the operator d must 


has boundary $\partial Q$, then the path integral

$$\int_{\partial Q} (A\,dx + B\,dy) = \iint_Q \left(\frac{\partial B}{\partial x} - \frac{\partial A}{\partial y}\right) dx\,dy.$$

Thus it seems promising, in this special case, to introduce two-forms of the form
$\tau = f(x,y)\,dx \wedge dy$. Such a two-form gives rise to a two-cochain by the process of
iterated integration, and a two-form can be obtained from a one-form by the rule
15.1. Exterior algebra


The one case in which we already have a differential operator that is perfectly
consistent with the coboundary operator is the case of zero-cochains and one-cochains.
An incipient zero-cochain is a differentiable function $\phi$, which assigns a
real number to every point. An incipient one-cochain is a one-form $\omega$, which assigns
to every point an element of the dual space $V^*$ of whatever underlying vector
space $V$ we are using. A function $\phi$ belongs to the space $\Omega^0(V)$; a one-form $\omega$
belongs to the space $\Omega^1(V)$, and we have found an operator $d: \Omega^0(V) \to \Omega^1(V)$ that
is consistent with the coboundary operator.

We shall now construct new spaces $\Omega^2(V), \Omega^3(V), \ldots, \Omega^k(V)$, which will permit
the differential operator $d$ to be generalized. An element $\omega$ of $\Omega^k(V)$, called a
differential $k$-form, will assign to every point in the domain of $\omega$ an element of a
space called $\Lambda^k(V^*)$, usually pronounced ‘wedge $k$ of $V^*$’, as a generalization of
the dual space $V^*$. A $k$-form is what we shall integrate to determine a $k$-cochain. We
will then be able to define a differential operator $d: \Omega^k(V) \to \Omega^{k+1}(V)$ that will be
consistent with $d: C^k \to C^{k+1}$ for any complex. As a first step in this direction, we
introduce the space $\Lambda^k(V^*)$.

Let $V$ be a vector space of dimension $n$, and let $V^*$ be its dual space.

The dual space, $V^*$, is the space of linear functions from $V$ to $\mathbb{R}$: $\mathcal{L}(V; \mathbb{R})$. Let us
now consider the space $\mathcal{B}(V, V; \mathbb{R})$ of bilinear functions from $V \times V$ to $\mathbb{R}$. To say
that such a function, $\tau(v_1, v_2)$, is bilinear means that, for fixed $v_2$, $\tau(v_1, v_2)$ is a linear
function of its first argument, i.e. $\tau(\alpha v_1 + \beta w_1, v_2) = \alpha\tau(v_1, v_2) + \beta\tau(w_1, v_2)$. Similarly,
for fixed $v_1$, $\tau$ is a linear function of its second argument. The space $\mathcal{B}(V, V; \mathbb{R})$ is a
vector space under the usual rules for adding functions and for multiplying a function
by a constant.

Within the space of functions $\mathcal{B}(V, V; \mathbb{R})$ is the subspace, denoted $\Lambda^2(V^*)$, of
alternating bilinear functions. These satisfy the condition $\tau(v_1, v_2) = -\tau(v_2, v_1)$.
From any bilinear function $\sigma(v_1, v_2)$ we can create an alternating function by the
construction $\tau(v_1, v_2) = \sigma(v_1, v_2) - \sigma(v_2, v_1)$.

It is not difficult to construct elements of $\Lambda^2(V^*)$ from elements of $V^*$. Given
two elements $\omega^i, \omega^j \in V^*$ we can define a function $\omega^i \wedge \omega^j$ by

$$\omega^i \wedge \omega^j(v_1, v_2) = \omega^i(v_1)\,\omega^j(v_2) - \omega^i(v_2)\,\omega^j(v_1).$$

Since the functions $\omega^i$ and $\omega^j$ are linear, $\omega^i \wedge \omega^j$ is bilinear. Furthermore,
$\omega^i \wedge \omega^j(v_2, v_1) = \omega^i(v_2)\,\omega^j(v_1) - \omega^i(v_1)\,\omega^j(v_2) = -\omega^i \wedge \omega^j(v_1, v_2)$, so that $\omega^i \wedge \omega^j$ is
alternating.
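The definition above can be tried out numerically. The following is a minimal sketch, assuming covectors on $\mathbb{R}^3$ are represented by coefficient lists; the helper names `covector` and `wedge` and the sample vectors are illustrative choices, not notation from the text.

```python
def covector(coeffs):
    """Return the linear function v -> sum_i coeffs[i] * v[i]."""
    return lambda v: sum(c * x for c, x in zip(coeffs, v))

def wedge(w_i, w_j):
    """(w_i ^ w_j)(v1, v2) = w_i(v1) w_j(v2) - w_i(v2) w_j(v1)."""
    return lambda v1, v2: w_i(v1) * w_j(v2) - w_i(v2) * w_j(v1)

w1 = covector([1, 2, 0])
w2 = covector([0, 1, -1])
tau = wedge(w1, w2)

v1 = (1, 0, 2)
v2 = (3, 1, 1)
w = (0, 4, 1)

# Alternating: swapping the arguments changes the sign.
alternating_ok = tau(v1, v2) == -tau(v2, v1)

# Linear in the first argument: tau(2 v1 + 3 w, v2) = 2 tau(v1, v2) + 3 tau(w, v2).
combo = tuple(2 * a + 3 * b for a, b in zip(v1, w))
bilinear_ok = tau(combo, v2) == 2 * tau(v1, v2) + 3 * tau(w, v2)
```

Integer inputs are used so that the comparisons are exact; with floating-point data one would compare up to a tolerance instead.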

It follows from the definition of the wedge product that it is distributive with
respect to addition:

$$(\omega^1 + \omega^2) \wedge \omega^3 = \omega^1 \wedge \omega^3 + \omega^2 \wedge \omega^3.$$

To prove this, let $v_1$ and $v_2$ be arbitrary vectors. Then

$$\begin{aligned}
[(\omega^1 + \omega^2) \wedge \omega^3](v_1, v_2) &= [(\omega^1 + \omega^2)(v_1)]\,\omega^3(v_2) - [(\omega^1 + \omega^2)(v_2)]\,\omega^3(v_1)\\
&= [\omega^1(v_1) + \omega^2(v_1)]\,\omega^3(v_2) - [\omega^1(v_2) + \omega^2(v_2)]\,\omega^3(v_1)\\
&= \omega^1 \wedge \omega^3(v_1, v_2) + \omega^2 \wedge \omega^3(v_1, v_2).
\end{aligned}$$








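Suppose that we have chosen dual bases: a basis $e_1, \ldots, e_n$ for $V$, and a
basis $\varepsilon^1, \varepsilon^2, \ldots, \varepsilon^n$ for $V^*$ such that

$$\varepsilon^i(e_j) = \begin{cases} 1 & \text{if } i = j,\\ 0 & \text{if } i \neq j.\end{cases}$$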


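If we express $\omega^1$ and $\omega^2$ in terms of a basis for $V^*$, then $\omega^1 \wedge \omega^2 = (a_1\varepsilon^1 +
a_2\varepsilon^2 + \cdots) \wedge (b_1\varepsilon^1 + b_2\varepsilon^2 + \cdots)$ is a sum of terms of the form $\varepsilon^i \wedge \varepsilon^j$. However,
$\varepsilon^j \wedge \varepsilon^i = -\varepsilon^i \wedge \varepsilon^j$, and in particular $\varepsilon^i \wedge \varepsilon^i = 0$. So every element of the form $\omega^1 \wedge \omega^2$ can be expressed in terms
of the $\varepsilon^i \wedge \varepsilon^j$ with $i < j$. More generally, let $\tau$ be any element of $\Lambda^2(V^*)$. Then the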


$i < j$. Let us call $\tau(e_i, e_j) = b_{ij}$. Then $\sum_{i<j} b_{ij}\,\varepsilon^i \wedge \varepsilon^j$ takes on the same values as $\tau$ on all
of $V^*$, we can construct $\tfrac{1}{2}n(n - 1)$ independent elements of $\Lambda^2(V^*)$: the elements
$\varepsilon^i \wedge \varepsilon^j$, with $i < j$, form a basis for $\Lambda^2(V^*)$.

In the case where $V$ is the space $\mathbb{R}^3$, it is convenient to name the basis elements

$dx \wedge dy$, $dx \wedge dz$, and $dy \wedge dz$.

Given two elements of $V^*$, we can use the wedge operator and the rule $\omega^i \wedge \omega^j =$

$3\,dx \wedge dz + dy \wedge dx - 3\,dy \wedge dz = dx \wedge dy - 3\,dx \wedge dz - 3\,dy \wedge dz.$

A convenient way to express the action of an element of $\Lambda^2(V^*)$ is by use of
determinants:

$$\omega^1 \wedge \omega^2(v_1, v_2) = \operatorname{Det}\begin{pmatrix} \omega^1(v_1) & \omega^1(v_2)\\ \omega^2(v_1) & \omega^2(v_2) \end{pmatrix}.$$

We shall use this notation to generalize the wedge product to the case of more 
A function $f$ of three vectors in $V$ is called trilinear if $f(v_1, v_2, v_3)$ depends linearly
$f(v_3, v_2, v_1) = -f(v_1, v_2, v_3)$.

For instance, the determinant of a $3 \times 3$ matrix is a trilinear antisymmetric function
of the columns.
trilinear functions from $V$ to $\mathbb{R}$. We can construct elements for this space by using the










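There is one further algebraic operation that will be of importance to us:
‘exterior’ (sometimes called ‘wedge’) multiplication of an element of $\Lambda^p(V^*)$ with an
element of $\Lambda^q(V^*)$ to obtain an element of $\Lambda^{p+q}(V^*)$. On basis elements it is defined
by

$$(\varepsilon^{i_1} \wedge \varepsilon^{i_2} \wedge \cdots \wedge \varepsilon^{i_p}) \wedge (\varepsilon^{j_1} \wedge \cdots \wedge \varepsilon^{j_q}) = \varepsilon^{i_1} \wedge \cdots \wedge \varepsilon^{i_p} \wedge \varepsilon^{j_1} \wedge \cdots \wedge \varepsilon^{j_q}.$$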

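It is then extended to linear combinations of basis elements so as to be linear in each
factor separately, i.e. the distributive law for this multiplication should hold. For
example, if

$$\omega = 5\varepsilon^1 \wedge \varepsilon^7 + 3\varepsilon^2 \wedge \varepsilon^4 \quad\text{and}\quad \sigma = 6\varepsilon^1 \wedge \varepsilon^5 + 2\varepsilon^3 \wedge \varepsilon^5 - 9\varepsilon^3 \wedge \varepsilon^6$$

are both elements of $\Lambda^2(V^*)$ then the element $\omega \wedge \sigma$ of $\Lambda^4(V^*)$ is computed as

$$\begin{aligned}
\omega \wedge \sigma &= 5(\varepsilon^1 \wedge \varepsilon^7) \wedge \sigma + 3(\varepsilon^2 \wedge \varepsilon^4) \wedge \sigma\\
&= 30\,\varepsilon^1 \wedge \varepsilon^7 \wedge \varepsilon^1 \wedge \varepsilon^5 + 10\,\varepsilon^1 \wedge \varepsilon^7 \wedge \varepsilon^3 \wedge \varepsilon^5 - 45\,\varepsilon^1 \wedge \varepsilon^7 \wedge \varepsilon^3 \wedge \varepsilon^6\\
&\quad + 18\,\varepsilon^2 \wedge \varepsilon^4 \wedge \varepsilon^1 \wedge \varepsilon^5 + 6\,\varepsilon^2 \wedge \varepsilon^4 \wedge \varepsilon^3 \wedge \varepsilon^5 - 27\,\varepsilon^2 \wedge \varepsilon^4 \wedge \varepsilon^3 \wedge \varepsilon^6.
\end{aligned}$$

The first summand vanishes because of the repeated factor of $\varepsilon^1$. The remaining
terms need some rearranging to make them into basis elements, and this may
introduce some sign changes. For example,

$$\varepsilon^1 \wedge \varepsilon^7 \wedge \varepsilon^3 \wedge \varepsilon^5 = \varepsilon^1 \wedge \varepsilon^3 \wedge \varepsilon^5 \wedge \varepsilon^7$$

since the $\varepsilon^7$ has to be moved past two factors to get it to its proper position, while

$$\varepsilon^2 \wedge \varepsilon^4 \wedge \varepsilon^3 \wedge \varepsilon^6 = -\varepsilon^2 \wedge \varepsilon^3 \wedge \varepsilon^4 \wedge \varepsilon^6.$$

Thus

$$\begin{aligned}
\omega \wedge \sigma &= 10\,\varepsilon^1 \wedge \varepsilon^3 \wedge \varepsilon^5 \wedge \varepsilon^7 - 45\,\varepsilon^1 \wedge \varepsilon^3 \wedge \varepsilon^6 \wedge \varepsilon^7 + 18\,\varepsilon^1 \wedge \varepsilon^2 \wedge \varepsilon^4 \wedge \varepsilon^5\\
&\quad - 6\,\varepsilon^2 \wedge \varepsilon^3 \wedge \varepsilon^4 \wedge \varepsilon^5 + 27\,\varepsilon^2 \wedge \varepsilon^3 \wedge \varepsilon^4 \wedge \varepsilon^6.
\end{aligned}$$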

At this point we strongly urge you to write down many examples of exterior
multiplication so as to get the hang of how it works. It will be one of the basic
computational tools for the rest of the book, so the effort invested now in gaining this
computational skill is very worthwhile. By working out many examples you should
convince yourself of the truth of the following rules:

The associative law: $(\omega \wedge \sigma) \wedge \tau = \omega \wedge (\sigma \wedge \tau)$.

The distributive law: $\omega \wedge (\sigma + \tau) = \omega \wedge \sigma + \omega \wedge \tau$.

Anticommutativity (nowadays called supercommutativity):

$\omega \wedge \tau = (-1)^{pq}\,\tau \wedge \omega$ if $\omega \in \Lambda^p(V^*)$ and $\tau \in \Lambda^q(V^*)$.

These rules, in turn, facilitate the computation of exterior multiplication.
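Exterior multiplication on basis elements is mechanical enough to automate, which gives a way of checking hand computations. The sketch below is one possible encoding, not the book's: a form is stored as a dictionary from increasing index tuples to coefficients, and the helper names `sort_sign`, `wedge` and `add` are hypothetical. It recomputes the worked example $\omega \wedge \sigma$ and spot-checks the three rules on sample forms.

```python
from collections import defaultdict

def sort_sign(indices):
    """Sort indices by adjacent swaps; return (sign, sorted tuple).
    The sign is 0 when an index repeats, since the product then vanishes."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, tuple(sorted(idx))
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign  # each adjacent swap flips the sign
    return sign, tuple(idx)

def wedge(a, b):
    """Wedge product of forms stored as {increasing index tuple: coefficient}."""
    out = defaultdict(int)
    for I, x in a.items():
        for J, y in b.items():
            sign, K = sort_sign(I + J)
            if sign:
                out[K] += sign * x * y
    return {K: c for K, c in out.items() if c != 0}

def add(a, b):
    """Sum of two forms of the same degree."""
    out = defaultdict(int)
    for form in (a, b):
        for K, c in form.items():
            out[K] += c
    return {K: c for K, c in out.items() if c != 0}

# The worked example: omega = 5 e1^e7 + 3 e2^e4, sigma = 6 e1^e5 + 2 e3^e5 - 9 e3^e6.
omega = {(1, 7): 5, (2, 4): 3}
sigma = {(1, 5): 6, (3, 5): 2, (3, 6): -9}
product = wedge(omega, sigma)

# Spot checks of the rules on sample one-forms and two-forms.
theta = {(3,): 2, (8,): 1}
eta = {(2,): 1, (5,): 4}
extra = {(2, 3): 1}
anticommutes = wedge(theta, eta) == {K: -c for K, c in wedge(eta, theta).items()}
commutes_2_1 = wedge(omega, theta) == wedge(theta, omega)  # (-1)^(2*1) = +1
associative = wedge(wedge(omega, sigma), theta) == wedge(omega, wedge(sigma, theta))
distributive = wedge(omega, add(sigma, extra)) == add(wedge(omega, sigma), wedge(omega, extra))
```

Counting adjacent swaps mirrors the phrase "moved past two factors" in the example: an even number of swaps leaves the sign alone, an odd number flips it.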

From a logical point of view, our definition is somewhat unsatisfactory in that it
seems to depend on the choice of basis. Strictly speaking we should prove that, if we
choose a different basis of $V$, and, correspondingly, different bases of $\Lambda^p(V^*)$, the
actual multiplication of elements does not change. This can be done directly, but is a
bit tedious. A better way is to give a more abstract definition of multiplication which
automatically satisfies the above rules so that the formula we wrote down for
multiplication becomes a consequence of the definition. We do this in the appendix
to Chapter 18, and if you are so inclined, you can read that appendix right now. It
can be read independently of any other material. But the important point is to gain
some computational familiarity with exterior multiplication.

As an example, let the vector space $V$ be four-dimensional. With eventual
application to spacetime in mind, we shall name the basis elements of $V^*$ as $dt$, $dx$,
$dy$, and $dz$. The complete collection of spaces $\Lambda^k(V^*)$ is then as follows:

$\Lambda^0(V^*)$ is one-dimensional: if we need to name the basis element, we call it 1.

$\Lambda^1(V^*)$ is four-dimensional: it is the space $V^*$. A basis is $\{dt, dx, dy, dz\}$.

$\Lambda^2(V^*)$ is six-dimensional, with basis elements $dt \wedge dx$, $dt \wedge dy$, $dt \wedge dz$,
$dx \wedge dy$, $dx \wedge dz$, $dy \wedge dz$.

$\Lambda^3(V^*)$ is four-dimensional, with basis elements $dt \wedge dx \wedge dy$, $dt \wedge dx \wedge dz$,
$dt \wedge dy \wedge dz$, $dx \wedge dy \wedge dz$.

$\Lambda^4(V^*)$ is one-dimensional, with basis element $dt \wedge dx \wedge dy \wedge dz$.
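The list above amounts to choosing increasing subsets of $\{dt, dx, dy, dz\}$, so it can be generated rather than written out by hand. This is a small illustrative sketch using only the Python standard library; the string form of each basis element (with `^` standing for the wedge) is a display choice made here.

```python
from itertools import combinations
from math import comb

names = ["dt", "dx", "dy", "dz"]

# Basis of wedge-k: one element for each increasing choice of k of the names.
bases = {
    k: [" ^ ".join(c) if c else "1" for c in combinations(names, k)]
    for k in range(len(names) + 1)
}
dims = [len(bases[k]) for k in range(len(names) + 1)]

for k in range(len(names) + 1):
    print(k, dims[k], bases[k])
```

The dimensions are the binomial coefficients $\binom{4}{k}$; for $k = 2$ and general $n$ this is the count $\tfrac{1}{2}n(n-1)$ found earlier for $\Lambda^2(V^*)$.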


15.2. $k$-forms and the d operator


To proceed from the space $\Lambda^k(V^*)$ to the space $\Omega^k(V)$, we now make the same
extension as in going from the dual space $V^*$ (also called $\Lambda^1(V^*)$) to the space
$\Omega^1(V)$. An element of $\Omega^k(V)$ is a function which assigns an element of $\Lambda^k(V^*)$ to

functions. The general element of $\Omega^2(\mathbb{R}^3)$, for example, is

$$a\,dx \wedge dy + b\,dx \wedge dz + c\,dy \wedge dz,$$

where $a$, $b$, $c$ are real valued functions. As far as algebraic operations involving
addition and the wedge multiplication are concerned, elements of $\Omega^k(V)$ behave
exactly as elements of $\Lambda^k(V^*)$ do. These elements of $\Omega^k(V)$, called differential $k$-forms
or more commonly just $k$-forms, will serve as our incipient $k$-cochains. We must now
learn how to differentiate and integrate them. We take $V = \mathbb{R}^n$ with $dx^1, \ldots, dx^n$ a
basis for $V^*$. (We also use $dx$, $dy$, etc.)

The differential operator $d$ which assigns a one-form to a zero-form is already
familiar. If $f(x, y, z)$ is a differentiable function (a zero-form), then

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz.$$


Eventually, we want to consider a $k$-form as an object which we integrate over a
‘$k$-dimensional hypersurface’ and want to prove Stokes’ theorem, which says that, for a $k$-form $\omega$,

$$\int_S d\omega = \int_{\partial S} \omega,$$

where $S$ is a ‘$(k+1)$-dimensional hypersurface’ and $\partial S$ its ‘boundary’. In Chapter 8 we saw
that this equation, applied to infinitesimal parallelograms, forced the definition of










the $d$ operator as applied to one-forms. We saw there that

$$d(f\,dg) = df \wedge dg.$$

Since every one-form is a sum of such expressions, this determined the definition of $d$
on one-forms. Perhaps now is a useful time to go back and review the basic formulas
of Chapter 8 as summarized on page 305.

Let us now proceed more algebraically and follow the example provided by one-forms.
We would like to have

$$d(f\,dg \wedge dh \wedge \cdots) = df \wedge dg \wedge dh \wedge \cdots. \qquad (*)$$

This forces us to define the operator $d$ acting on $k$-forms as follows:

If $\tau = f\,dx^{i_1} \wedge \cdots \wedge dx^{i_k} + g\,dx^{j_1} \wedge \cdots \wedge dx^{j_k} + \cdots$

then $d\tau = df \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k} + dg \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_k} + \cdots. \qquad (15.1)$

After the operator d has been applied, some rearrangement of terms may be 
necessary in order to collect the coefficients of each basis element. 

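As an example of the action of $d$ on a two-form in $\mathbb{R}^3$, let $\tau = B_x\,dy \wedge dz -
B_y\,dx \wedge dz + B_z\,dx \wedge dy$ where $B_x, B_y, B_z$ are differentiable functions of $x, y, z$ which
may be regarded as the components of a vector field. Then $d\tau = dB_x \wedge dy \wedge dz -
dB_y \wedge dx \wedge dz + dB_z \wedge dx \wedge dy$. Expanding the differentials, and making use of the
fact that a wedge product vanishes when two of its
factors are the same, we have:

$$d\tau = \left(\frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z}\right) dx \wedge dy \wedge dz.$$

For those of you familiar with vector analysis, notice the similarity between
$d: \Omega^2(\mathbb{R}^3) \to \Omega^3(\mathbb{R}^3)$ and the operation div.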
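The divergence computation can be checked numerically. The sketch below is an illustration under stated assumptions rather than a general-purpose implementation: forms are stored as dictionaries from increasing index tuples to coefficient functions, partial derivatives are approximated by central differences, and the vector field $B$ and all helper names (`partial`, `sort_sign`, `accumulate`, `d`) are choices made for the example.

```python
def partial(f, j, h=1e-5):
    """Central-difference approximation to the j-th partial derivative of f."""
    def df(*x):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        return (f(*up) - f(*down)) / (2 * h)
    return df

def sort_sign(indices):
    """Return (sign, sorted indices); the sign is 0 when an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, tuple(sorted(idx))
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def accumulate(prev, g, sign):
    """Return the function prev + sign * g (prev may be None)."""
    if prev is None:
        return lambda *x: sign * g(*x)
    return lambda *x: prev(*x) + sign * g(*x)

def d(form, n):
    """Exterior derivative on R^n: each term f dx^I contributes (df/dx^j) dx^j ^ dx^I."""
    out = {}
    for I, f in form.items():
        for j in range(n):
            sign, K = sort_sign((j,) + I)
            if sign:
                out[K] = accumulate(out.get(K), partial(f, j), sign)
    return out

# Coordinates: index 0 = x, 1 = y, 2 = z. A sample vector field B.
Bx = lambda x, y, z: x * x * y
By = lambda x, y, z: y * z
Bz = lambda x, y, z: x * z ** 3

# tau = Bx dy^dz - By dx^dz + Bz dx^dy
tau = {(1, 2): Bx, (0, 2): lambda x, y, z: -By(x, y, z), (0, 1): Bz}
d_tau = d(tau, 3)

point = (1.0, 2.0, 0.5)
coefficient = d_tau[(0, 1, 2)](*point)
divergence = 2 * 1.0 * 2.0 + 0.5 + 3 * 1.0 * 0.5 ** 2  # div B = 2xy + z + 3xz^2
```

The only surviving basis element is $dx \wedge dy \wedge dz$, and its coefficient agrees with $\operatorname{div} B$ at the sample point up to the finite-difference error.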

There are several properties of the operator $d$ that are useful and worth recording.
In every case, it will be sufficient to verify the property for the case of a $k$-form
$\Omega = f\,dx \wedge dy \wedge \cdots$. The general case then follows immediately by the linearity of $d$.

The first property concerns the result of applying $d$ to the product of a function
and a $k$-form. Let $f$ and $g$ be differentiable functions; let $\Omega = g\,dx \wedge dy \wedge \cdots$. Then
$d(f\Omega) = d(fg\,dx \wedge dy \wedge \cdots) = d(fg) \wedge dx \wedge dy \wedge \cdots$. But we have the product rule
$d(fg) = f\,dg + g\,df$ for differentials. It follows that $d(f\Omega) = f\,dg \wedge
dx \wedge dy \wedge \cdots + g\,df \wedge dx \wedge dy \wedge \cdots$, or $d(f\Omega) = f\,d(g\,dx \wedge dy \wedge \cdots) + df \wedge (g\,dx \wedge
dy \wedge \cdots)$. We conclude that

$$d(f\Omega) = f\,d\Omega + df \wedge \Omega.$$

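The second property is a more general product rule which applies to the product
of a $p$-form $\omega$ and a $q$-form $\Omega$. Let $\omega = f\,dx^1 \wedge \cdots \wedge dx^p$ and $\Omega = g\,dy^1 \wedge \cdots \wedge dy^q$
where $f$ and $g$ are differentiable functions. Then $d(\omega \wedge \Omega) = d(fg\,dx^1 \wedge \cdots \wedge dx^p \wedge
dy^1 \wedge \cdots \wedge dy^q)$. So $d(\omega \wedge \Omega) = f(dg \wedge dx^1 \wedge \cdots \wedge dx^p \wedge dy^1 \wedge \cdots \wedge dy^q) + g\,df \wedge
dx^1 \wedge \cdots \wedge dx^p \wedge dy^1 \wedge \cdots \wedge dy^q$. In the first term, we now interchange $dg$ with the
factor $dx^1 \wedge \cdots \wedge dx^p$, which introduces a sign $(-1)^p$. Then:

$$\begin{aligned}
d(\omega \wedge \Omega) &= (-1)^p (f\,dx^1 \wedge \cdots \wedge dx^p) \wedge (dg \wedge dy^1 \wedge \cdots \wedge dy^q)\\
&\quad + (df \wedge dx^1 \wedge \cdots \wedge dx^p) \wedge (g\,dy^1 \wedge \cdots \wedge dy^q).
\end{aligned}$$

As every $p$-form is a sum of terms of the form $f\,dx^1 \wedge \cdots \wedge dx^p$ and every $q$-form
is a sum of terms of the form $g\,dy^1 \wedge \cdots \wedge dy^q$, we conclude that

$$d(\omega \wedge \Omega) = (-1)^p\,\omega \wedge d\Omega + d\omega \wedge \Omega. \qquad (15.2)$$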


The third property concerns the application of $d$ twice in succession. We consider
first $d(df)$ where $f$ is a twice-differentiable function $f(x, y, \ldots)$. In this case,

$$d(df) = d\left(\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \cdots\right) = \frac{\partial^2 f}{\partial y\,\partial x}\,dy \wedge dx + \frac{\partial^2 f}{\partial x\,\partial y}\,dx \wedge dy + \cdots.$$

From the equality of mixed partial derivatives and the relation $dy \wedge dx =
-dx \wedge dy$, it follows that $d(df) = 0$. Now consider the more general case of a
$k$-form $\omega = f\,dx \wedge dy \wedge \cdots$. In this case, $d\omega = df \wedge (dx \wedge dy \wedge \cdots)$ and, by (15.2), $d(d\omega) =
-df \wedge d(dx \wedge dy \wedge \cdots) + d(df) \wedge dx \wedge dy \wedge \cdots$. But $d(dx \wedge dy \wedge \cdots) = 0$ and $d(df) =
0$, so we conclude that, in general,

$$d(d\omega) = 0. \qquad (15.3)$$

This is consistent with the property

$$d \circ d = 0$$

of the coboundary operator.
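The cancellation behind $d(df) = 0$ can be observed numerically. The following sketch assumes the same illustrative encoding as before (forms as dictionaries from increasing index tuples to coefficient functions, partial derivatives by central differences); the function $f$ and the sample point are arbitrary choices, and the result is zero only up to rounding error.

```python
import math

def partial(f, j, h=1e-3):
    """Central-difference approximation to the j-th partial derivative of f."""
    def df(*x):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        return (f(*up) - f(*down)) / (2 * h)
    return df

def sort_sign(indices):
    """Return (sign, sorted indices); the sign is 0 when an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, tuple(sorted(idx))
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def accumulate(prev, g, sign):
    """Return the function prev + sign * g (prev may be None)."""
    if prev is None:
        return lambda *x: sign * g(*x)
    return lambda *x: prev(*x) + sign * g(*x)

def d(form, n):
    """Exterior derivative of {increasing index tuple: coefficient function} on R^n."""
    out = {}
    for I, f in form.items():
        for j in range(n):
            sign, K = sort_sign((j,) + I)
            if sign:
                out[K] = accumulate(out.get(K), partial(f, j), sign)
    return out

# A zero-form (function) on R^3, stored under the empty index tuple.
f = lambda x, y, z: x * x * y + y * math.sin(z) + x * z
df = d({(): f}, 3)    # one-form with components for dx, dy, dz
ddf = d(df, 3)        # two-form whose coefficients should all vanish

point = (0.3, -1.2, 0.7)
residuals = {K: g(*point) for K, g in ddf.items()}
```

Each coefficient of `ddf` is a difference of two mixed second differences taken in opposite orders, which is the discrete counterpart of the equality of mixed partial derivatives used above.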

Of course equation (*) now follows from (15.2) and (15.3).

If we think of $dx$ as $d$ applied to the coordinate
function $x$, then the definition (15.1) becomes a consequence of (15.2) and (15.3). So,
once you have convinced yourself that the operator $d$ exists, all you need to remember is:

(1) $df = (\partial f/\partial x)\,dx + (\partial f/\partial y)\,dy + \cdots$, where $x, y, \ldots$ are the coordinate
functions,

(2) $d$ respects addition, i.e. $d(\omega + \sigma) = d\omega + d\sigma$,

(3) how $d$ acts on a product, i.e. (15.2), and

(4) $d \circ d = 0$.

From these four rules you can compute d of any form. 


15.3. Integration of $k$-forms

To complete the identification of $k$-forms as incipient $k$-cochains we must now
explain how a $k$-form $\Omega$ assigns a $k$-cochain to every complex with differentiable
cells situated in an $n$-dimensional space. Let us first review the case $k = 1$. Given
a smooth curve in $\mathbb{R}^n$ which joins point $A$ to point $B$, and a one-form $\omega$, we wish
to evaluate $\omega$ on the curve. To achieve this, we choose any smooth parameterization
of the curve, that is, any smooth mapping $\alpha: \mathbb{R} \to \mathbb{R}^n$ which maps the interval $[0, 1]$ into
the desired curve, with $\alpha(0) = A$ and $\alpha(1) = B$, as shown in figure 15.3. We then
pull back the one-form $\omega$ to obtain $\alpha^*\omega$, an expression of the form $g(t)\,dt$, and we
define the path integral of the form $\omega$ over the curve $\alpha$ as $\int_\alpha \omega = \int_0^1 \alpha^*\omega$. The
usefulness of this definition lies in the fact that the result is independent of the
particular parameterization, $\alpha$, chosen for the curve. The strategy in this case was
to reduce the integral over a curve to an integral over a much simpler region (the
interval $[0, 1]$). The crucial construction was the pullback procedure for a one-form:
if $\omega = f\,dx + g\,dy + \cdots$ then $\alpha^*\omega = (\alpha^*f)\,d(\alpha^*x) + (\alpha^*g)\,d(\alpha^*y) + \cdots$. What had to be
proved (from the chain rule) was that the result was independent of the
parameterization.
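As a numerical illustration of this recipe, the sketch below pulls a one-form on $\mathbb{R}^2$ back along two different parameterizations of the same quarter circle and integrates over $[0, 1]$. The one-form $\omega = -y\,dx + x\,dy$, the curve, the finite-difference step and the midpoint rule are all choices made for this example; the names `pull_back` and `integrate` are hypothetical.

```python
import math

def pull_back(omega, alpha, h=1e-6):
    """For omega = f dx + g dy given as the pair (f, g), return the
    coefficient function c with alpha^* omega = c(t) dt."""
    f, g = omega
    def coeff(t):
        x, y = alpha(t)
        xp, yp = alpha(t + h)
        xm, ym = alpha(t - h)
        dx_dt = (xp - xm) / (2 * h)   # d(alpha^* x)/dt by central differences
        dy_dt = (yp - ym) / (2 * h)   # d(alpha^* y)/dt
        return f(x, y) * dx_dt + g(x, y) * dy_dt
    return coeff

def integrate(c, n=2000):
    """Midpoint rule for the integral of c over [0, 1]."""
    return sum(c((i + 0.5) / n) for i in range(n)) / n

# omega = -y dx + x dy
omega = (lambda x, y: -y, lambda x, y: x)

# Two parameterizations of the quarter circle from A = (1, 0) to B = (0, 1).
alpha = lambda t: (math.cos(math.pi * t / 2), math.sin(math.pi * t / 2))
beta = lambda t: alpha(t * t)   # same curve, traversed at a different speed

integral_alpha = integrate(pull_back(omega, alpha))
integral_beta = integrate(pull_back(omega, beta))
```

Both values come out close to $\pi/2$, in line with the claim that the path integral does not depend on the parameterization.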

The crucial ingredient in this definition, and in the corresponding theory of two-dimensional
integration that we discussed in Chapter 8, is the notion of pullback. So
our first order of business is to generalize the notion of pullback. Let $\phi: \mathbb{R}^k \to \mathbb{R}^n$ be a
differentiable map. We wish to define an operation, $\phi^*$, called pullback, which
assigns to any differential form $\omega$ (of any degree) on $\mathbb{R}^n$ a differential form $\phi^*\omega$ (of the
same degree) on $\mathbb{R}^k$. We would like $\phi^*$ to preserve addition and multiplication of
forms, that is, we would like to have

$$\phi^*(\omega_1 + \omega_2) = \phi^*\omega_1 + \phi^*\omega_2 \qquad (15.4)$$

and

$$\phi^*(\omega_1 \wedge \omega_2) = \phi^*\omega_1 \wedge \phi^*\omega_2. \qquad (15.5)$$

We would also like $\phi^*$ to be our old pullback when applied to ‘zero-forms’, that is to
functions, and when applied to linear differential forms. More precisely, if $f \in \Omega^0(\mathbb{R}^n)$
is a function, we want

$$\phi^*f = f \circ \phi \qquad (15.6)$$

and

$$\phi^*(df) = d(\phi^*f). \qquad (15.7)$$

The requirements (15.4)-(15.7) completely force our hand. For example, consider the
map $\phi: \mathbb{R}^3 \to \mathbb{R}^3$ given by

$$\phi\begin{pmatrix} r\\ \theta\\ \psi \end{pmatrix} = \begin{pmatrix} x\\ y\\ z \end{pmatrix} \quad\text{where}\quad \begin{aligned} x &= r\sin\theta\cos\psi,\\ y &= r\sin\theta\sin\psi,\\ z &= r\cos\theta. \end{aligned}$$

Thus

$$\phi^*x = r\sin\theta\cos\psi$$

etc. by (15.6). Suppose we want to compute $\phi^*(dx \wedge dy \wedge dz)$. Then

$$\begin{aligned}
\phi^*(dx \wedge dy \wedge dz) &= \phi^*dx \wedge \phi^*dy \wedge \phi^*dz && \text{by (15.5)}\\
&= (d\phi^*x) \wedge (d\phi^*y) \wedge (d\phi^*z) && \text{by (15.7)}\\
&= [d(r\sin\theta\cos\psi)] \wedge [d(r\sin\theta\sin\psi)] \wedge [d(r\cos\theta)] && \text{by (15.6)}.
\end{aligned}$$

Within each factor of this triple product we apply the usual rule for computing the
differential of a function, so that $d(r\sin\theta\cos\psi) = \sin\theta\cos\psi\,dr - r\sin\theta\sin\psi\,d\psi +
r\cos\theta\cos\psi\,d\theta$, etc. We then use the rules of exterior multiplication to express the
triple product in terms of $dr \wedge d\theta \wedge d\psi$. We find that

$$\phi^*(dx \wedge dy \wedge dz) = r^2\sin\theta\,dr \wedge d\theta \wedge d\psi.$$

(This formula is usually called the ‘expression of Euclidean volume in polar
coordinates’.)
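Expanding the triple product by hand is a good exercise; the result can then be checked numerically. A wedge product of three one-forms in $dr, d\theta, d\psi$ equals the determinant of their coefficient matrix times $dr \wedge d\theta \wedge d\psi$, so it suffices to evaluate that determinant. The sketch below does this at one sample point; the function names and the sample values are illustrative.

```python
import math

def differential_rows(r, theta, psi):
    """Rows hold the coefficients of (dr, dtheta, dpsi) in d(phi^*x), d(phi^*y), d(phi^*z)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(psi), math.cos(psi)
    return [
        [st * cp, r * ct * cp, -r * st * sp],  # d(r sin(theta) cos(psi))
        [st * sp, r * ct * sp, r * st * cp],   # d(r sin(theta) sin(psi))
        [ct, -r * st, 0.0],                    # d(r cos(theta))
    ]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

r, theta, psi = 2.0, 0.7, 1.3
coefficient = det3(differential_rows(r, theta, psi))  # coefficient of dr ^ dtheta ^ dpsi
expected = r ** 2 * math.sin(theta)
```

The two numbers agree to rounding error, matching the factor $r^2\sin\theta$ in the formula above.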

In general, requirements (15.4)-(15.7) force us to define

$$\phi^*(f\,dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_k}) = (f \circ \phi)\,d(x^{i_1} \circ \phi) \wedge \cdots \wedge d(x^{i_k} \circ \phi), \qquad (15.8)$$

where the $x^i$ ($i = 1, \ldots, n$) are coordinates on $\mathbb{R}^n$. We must then define $\phi^*$ on the most
general form, which is a sum of such expressions, by (15.4). If we take (15.8) and its
extension to sums as the definition of pullback, then we must go back and check that
(15.4)-(15.7) actually hold. This is straightforward and will be left as an exercise.
From the definition of $d$ and the properties of pullback it follows that

$$\begin{aligned}
\phi^*[d(f\,dx^i \wedge dx^j \wedge \cdots)] &= \phi^*[df \wedge dx^i \wedge \cdots]\\
&= d(\phi^*f) \wedge d\phi^*x^i \wedge \cdots\\
&= d[(\phi^*f)\,d\phi^*x^i \wedge \cdots]\\
&= d[\phi^*(f\,dx^i \wedge \cdots)].
\end{aligned}$$

Applying this computation to sums of expressions of the above form we get the basic
formula

$$\phi^*\,d\omega = d\,\phi^*\omega. \qquad (15.9)$$

This result is of central importance. It says that first applying $d$ and then pulling back
is the same as first pulling back and then applying $d$. It can be thought of as a version
of the chain rule.

There is one final formula about pullbacks that we should record. Suppose that
$\psi: \mathbb{R}^p \to \mathbb{R}^k$ so that we can form the composite map $\phi \circ \psi: \mathbb{R}^p \to \mathbb{R}^n$. Then for any
function $f$ we know that

$$(\phi \circ \psi)^*f = f \circ (\phi \circ \psi) = (f \circ \phi) \circ \psi = \psi^*\phi^*f.$$

But then it follows from our rules that

$$(\phi \circ \psi)^*\omega = \psi^*\phi^*\omega \qquad (15.10)$$

for any differential form $\omega$.




Let us collect the various rules of the exterior differential calculus.

Algebraic operations: addition and multiplication of differential forms.

the differential forms (of any fixed degree) constitute a vector space under
addition;

the product of a form of degree $p$ by a form of degree $q$ is a form of degree
$p + q$;

distributive law: $(\omega_1 + \omega_2) \wedge \sigma = \omega_1 \wedge \sigma + \omega_2 \wedge \sigma$.

associative law: $(\omega_1 \wedge \omega_2) \wedge \omega_3 = \omega_1 \wedge (\omega_2 \wedge \omega_3)$.

anticommutativity: $\omega_1 \wedge \omega_2 = (-1)^{pq}\,\omega_2 \wedge \omega_1$ if $\deg \omega_2 = p$ and $\deg \omega_1 = q$.

The operator d.

if $\omega$ is a form of degree $p$ then $d\omega$ is a form of degree $p + 1$,

where

$df = (\partial f/\partial x)\,dx + (\partial f/\partial y)\,dy + \cdots$ in terms of coordinates $(x, y, \ldots)$.

$d(\omega_1 + \omega_2) = d\omega_1 + d\omega_2$.

Pullback

$\phi^*f = f \circ \phi$ for a function $f$,

$\phi^*(\omega_1 + \omega_2) = \phi^*\omega_1 + \phi^*\omega_2$,

$\phi^*(\omega \wedge \sigma) = \phi^*\omega \wedge \phi^*\sigma$,

$\phi^*d = d\phi^*$,

In the above discussion it was assumed that the maps $\phi$, $\psi$ etc. and the forms $\omega$, $\sigma$ etc.
various operations for forms, maps and so on which are defined on various open sets.
contains $\phi(U)$ in order to be able to define the pullback $\phi^*\omega$. If $\psi$ is only defined on
some open set $O$ then we must assume that $\psi(O) \subset U$ if we want to define $\phi \circ \psi$.


Definition of integration over regions in $\mathbb{R}^k$. We now want to define the integral of a
$k$-form over an oriented region, $U$, in $\mathbb{R}^k$. That is, we are considering the situation
where the degree of the form equals the dimension of the space. We will assume that








(See Fig. 15.4.) Then we assume that we can make the difference

$$\mathrm{Out}_\varepsilon(U) - \mathrm{Inn}_\varepsilon(U)$$

as small as we like by choosing $\varepsilon$ sufficiently small. This means that in computing
Riemann approximating sums we don’t have to worry about ‘cubes along the
boundary’. For example, it is clear that any bounded polyhedron is nice in this sense.
Also, if $U$ is nice and if $\psi$ is a differentiable map defined in a slightly larger region
than $U$, then the mean value theorem implies that $\psi(U)$ is also nice. (You can prove

Calculus where the basic facts about the Riemann integral in $n$ dimensions are
explained in detail.) We will also assume that orientation on $U$ is the standard

Now let $\Omega$ be a $k$-form on $\mathbb{R}^k$. Then we can write

$$\Omega = f\,dx^1 \wedge \cdots \wedge dx^k,$$

where $f$ is a differentiable function. For any $\varepsilon$ mesh we can construct the Riemann
approximating sum

$$\sum_i f(p_i)\,\mathrm{vol}(\square_i) = \varepsilon^k \sum_i f(p_i),$$

where $p_i \in \square_i$ and the $\square_i$ range over all the cubes in the mesh contained in (or having
non-empty intersection with) $U$. The (uniform) continuity of $f$ implies that this sum
approaches a limit as $\varepsilon \to 0$, independent of the choice of the $p_i$ or of the mesh. This
limit will be denoted by

$$\int_U \Omega.$$

It is important to remember that in this definition $U$ has the standard orientation of
$\mathbb{R}^k$. The following are obvious properties of the integral. Suppose that
$U = U_1 \cup \cdots \cup U_p$ is a finite union of nice subregions, all with the standard
orientation, then
Irrelevance of lower dimensional pieces. Let us say that a set $S$ has zero content if, for
any positive number $\delta$, we have $\mathrm{Out}_\varepsilon(S) < \delta$ if $\varepsilon$ is chosen sufficiently small. (In other
words, $S$ has content zero if the total volume of the cubes in a mesh which intersect $S$
can be made as small as we like by choosing the mesh size small enough.) For
example, if $S$ lies in a linear subspace of dimension $< k$ (or any translate thereof) then
$S$ clearly has zero content. Then sets of zero content are irrelevant as far as
integration is concerned. That is, if $U = U' \cup S$ where $S$ has zero content then
$\int_U \Omega = \int_{U'} \Omega$. This implies, for example, that if we divide a polyhedron up into
subpolyhedra, we don’t have to worry about the lower dimensional faces in
computing the integral.

Also,

if $U$ is the unit cube, $\{0 \le x_i \le 1,\ i = 1, \ldots, k\}$, then

$$\int_U \Omega = k\text{-fold iterated integral of } f = \int_0^1 \cdots \int_0^1 f\,dx^1 \cdots dx^k.$$

The proof is the same as in two dimensions, cf. p. 279 of Chapter 8. Of course, just as
in two dimensions, any region that lends itself to iterated integration will do, not just
a cube.
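To see the definition at work in a small case, the sketch below forms Riemann approximating sums for a two-form $\Omega = f\,dx^1 \wedge dx^2$ on the unit square and compares them with the iterated integral. The integrand $f = x^1 (x^2)^2$, the mesh size and the two choices of sample point $p_i$ are assumptions made for this illustration.

```python
def riemann_sum(f, n, offset):
    """Sum of f(p_i) * eps^2 over an n x n mesh of the unit square, eps = 1/n.
    offset in [0, 1] selects the sample point inside each cube
    (0.0 = lower-left corner, 0.5 = midpoint)."""
    eps = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f((i + offset) * eps, (j + offset) * eps)
    return total * eps ** 2

f = lambda x, y: x * y * y

# Iterated integral: (integral of x over [0, 1]) * (integral of y^2 over [0, 1]).
exact = (1.0 / 2.0) * (1.0 / 3.0)

corner_sum = riemann_sum(f, 400, 0.0)
midpoint_sum = riemann_sum(f, 400, 0.5)
```

Both sums approach $1/6$ as the mesh is refined, with the midpoint choice converging faster; this reflects the statement that the limit does not depend on how the $p_i$ are chosen.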

The change of variables formula. We now come to a basic fact, the change of variables
formula. It says:

Suppose that $\phi: \mathbb{R}^k \to \mathbb{R}^k$ is a one-to-one differentiable orientation-preserving map
with differentiable inverse. Then for any $k$-form $\omega$ defined on the image space, and
for any nice region $U$ in the domain space, we have

$$\int_{\phi(U)} \omega = \int_U \phi^*\omega.$$

(As usual, $\phi$ need only be defined on some region slightly larger than $U$ and $\omega$ need
only be defined on some region slightly larger than $\phi(U)$.)

Proof of the change of variables formula 

(i) Let ^4: [R fc —> (R fc be a linear map. Then for any bounded nice region 
(vol(AU)/\o[{U)) = |det4|. 

O ne begins by proving that the left-hand side of the equation in (i) is indep endent of 
JJ. This goes just as in the two - dim e nsional cas e , cf. s e ction 1.9. L e t us call the 
quotient on the left side of the equation vol (A). So vol (A) measures the proportional 
change in volume effected by A. It follows from the definition that 

vol ( AB ) = vol (A)- vol ( B ), 

and w e know from Chapter 11 that 

det (AB) = det A ■ det B. 

We wish to prove that 

vol (A) = | det A | (*) 

for all k by k matrices. From the preceding two equations we can conclude that if (*) 
is true for A and for B then it is true for AB. If A is a diagonal matrix then (*) is true by 
inspection. If the matrix A has zero entries below the diagonal, all 1 on the diagonal, 
and only one nonzero entry above the diagonal, then vol (A) = 1 (this is really the 
two-dimensional assertion about shear transformations). From this it follows that if
$A = U$ is an upper triangular matrix with 1 on the diagonal (and all zeros below the
diagonal) then $\mathrm{vol}(U) = \det U = 1$. Similarly for lower triangular matrices. We
conclude that (*) is true for all matrices A which can be written as 

A = LDU 

with L lower triangular, D diagonal, and U upper triangular. When does a matrix A 
have such a decomposition? An examination of the row reduction procedure of
section 10.8 shows that this will happen if and only if the row reduction of $A$ needs no
transpositions, and this will occur if and only if none of the principal minors vanish.
(The principal minors are the determinants of the square matrices coming from
taking the first $r$ rows and columns of $A$.) Thus the conditions are

$a_{11} \ne 0,\ (a_{11}a_{22} - a_{21}a_{12}) \ne 0,\ \ldots,\ \det A \ne 0$.

But for any matrix A we can arrange that all these inequalities hold by making 
arbitrarily small changes in the matrix entries. Since both vol (A) and det A are 
continuous functions of A we conclude that (*) holds for all matrices. 
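
The decomposition $A = LDU$ can be tried out on a small example. The following is a minimal sketch in Python, assuming exact rational arithmetic and a matrix whose leading principal minors are all non-zero (so that no row exchanges are needed); the helper names `ldu` and `matmul` are illustrative and do not come from the text. Since vol(L) = vol(U) = 1, the product of the diagonal entries of D equals det A, and its absolute value is vol(A).

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def ldu(a):
    """Factor a square matrix as L*D*U without row exchanges.

    L is lower triangular with 1 on the diagonal, D is diagonal (returned
    as a list of its entries) and U is upper triangular with 1 on the
    diagonal. A vanishing pivot raises ZeroDivisionError, which is the
    case of a vanishing leading principal minor.
    """
    n = len(a)
    u = [[Fraction(v) for v in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j + 1, n):
            m = u[i][j] / u[j][j]      # multiplier; the pivot must be non-zero
            l[i][j] = m
            for c in range(n):
                u[i][c] -= m * u[j][c]
    d = [u[i][i] for i in range(n)]
    for i in range(n):
        u[i] = [v / d[i] for v in u[i]]  # rescale rows so U has 1 on the diagonal
    return l, d, u


A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
L, d, U = ldu(A)
D = [[d[i] if i == j else Fraction(0) for j in range(3)] for i in range(3)]
det_A = d[0] * d[1] * d[2]   # vol(A) = |det_A|, because vol(L) = vol(U) = 1
```

Multiplying the three factors back together with `matmul(matmul(L, D), U)` recovers `A`, which is a quick way to confirm the factorization on other test matrices.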

Here is an alternative proof of (*) based on a different way of decomposing
non-singular matrices into products. For any matrix $A$, the matrix $A^*A$ is self-adjoint,
since

$(A^*A)^* = A^*A^{**} = A^*A$.

Furthermore, if $A$ is non-singular, then the eigenvalues of $A^*A$ are all positive, since
$(A^*Ax, x) = (Ax, Ax) > 0$ for any non-zero vector $x$. Therefore $A^*A$ has a positive
definite square root, i.e. a matrix $S$ which is symmetric and positive definite and
which satisfies $S^2 = A^*A$. (Indeed, we may write $A^*A = OLO^*$ where $L$ is a diagonal
matrix with positive entries on the diagonal, and $O$ is an orthogonal matrix. Then
$S = OMO^*$ where $M$ is the diagonal matrix whose entries are the square roots of the
entries of $L$.) We claim that the matrix $AS^{-1}$ is orthogonal. Indeed

$(AS^{-1}x, AS^{-1}y) = (S^{-1}x, A^*AS^{-1}y) = (S^{-1}x, S^2S^{-1}y) = (x, y)$

since $S^{-1}$ is self-adjoint. Thus we may write

$A = KS$

where $K = AS^{-1}$ is orthogonal. But $S = OMO^*$. So

$A = KOMO^*$

where $O$ (and $O^* = O^{-1}$) and $K$ are orthogonal and $M$ is diagonal, with positive
entries. For $M$ we have $\mathrm{vol}(M) = \det M$. Orthogonal matrices preserve length and
therefore preserve volume, so $\mathrm{vol}(O) = \mathrm{vol}(K) = 1$. On the other hand,
$\det O = \pm 1$ (since $\det O \cdot \det O^* = \det OO^* = 1$ and $\det O = \det O^*$), and likewise
$\det K = \pm 1$. Hence $\mathrm{vol}(A) = \mathrm{vol}(M) = \det M = |\det A|$. This gives an
alternative proof of (*).

(ii) Let $\phi = A: \mathbb{R}^k \to \mathbb{R}^k$ be a linear map. Then

$\phi^*(dx^1 \wedge \cdots \wedge dx^k) = (\det A)\, dx^1 \wedge \cdots \wedge dx^k$. (**)

This follows from the definition of pullback and exterior product:
$\phi^*x^1 = a_{11}x^1 + a_{12}x^2 + \cdots + a_{1k}x^k$ and since the $a_{ij}$ are constants

$\phi^*dx^1 = a_{11}\,dx^1 + \cdots + a_{1k}\,dx^k$,
$\vdots$
$\phi^*dx^k = a_{k1}\,dx^1 + \cdots + a_{kk}\,dx^k$.

Now $\phi^*dx^1 \wedge \cdots \wedge \phi^*dx^k$ will be some multiple of $dx^1 \wedge \cdots \wedge dx^k$, and this
multiple, call it $f(A)$, is some numerical function of the matrix $A$ which satisfies the
axioms for the determinant function. Hence by the uniqueness properties of the
determinant function proved in Chapter 11, we conclude that (**) holds.
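
The multiple $f(A)$ can be computed directly by expanding the product of the $k$ one-forms and reordering each surviving monomial $dx^{\sigma(1)} \wedge \cdots \wedge dx^{\sigma(k)}$ into standard order, which costs a factor sign($\sigma$); monomials with a repeated $dx^i$ vanish. The sketch below does exactly this and checks two of the determinant axioms on small integer matrices. The function names are illustrative, not taken from the text.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation of 0..k-1, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def top_coefficient(a):
    """Coefficient f(A) of dx^1 ^ ... ^ dx^k in the wedge product of the
    one-forms a[i][0] dx^1 + ... + a[i][k-1] dx^k, for i = 0, ..., k-1."""
    k = len(a)
    total = 0
    for perm in permutations(range(k)):
        term = sign(perm)            # reordering dx^{perm} to standard order
        for i in range(k):
            term *= a[i][perm[i]]    # pick dx^{perm[i]} from the i-th factor
        total += term
    return total


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
C_swapped = [C[1], C[0], C[2]]       # interchange the first two one-forms
```

Here `top_coefficient(matmul(A, B))` equals `top_coefficient(A) * top_coefficient(B)`, and interchanging two rows of `C` flips the sign, as the axioms require.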
















(iii) Steps (i) and (ii) imply the change of variables formula for the case that $\phi$ is a
linear map, i.e. $\phi = A$ with $\det A > 0$. Here is the proof: Write $\omega = g\, dx^1 \wedge \cdots \wedge dx^k$,
where $g$ is a function. Then by (ii),

$\phi^*\omega = h \det A\; dx^1 \wedge \cdots \wedge dx^k$ where $h(x) = g(Ax)$.

Cover the region $U$ by a mesh of small cubes, see Fig. 15.6. This has the effect of
covering the region $\phi(U)$ by a mesh of small parallelepipeds, and the volume of each
parallelepiped differs from the volume of the corresponding cube by a factor of $\det A$.

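Given any $\varepsilon > 0$, we can choose the mesh size sufficiently small so that the function
$g$ does not vary by more than $\varepsilon$ on each parallelepiped:

$|g(p) - g(q)| < \varepsilon$,

whenever $p$ and $q$ lie in the same parallelepiped. This implies that if $\square$ denotes any
one of the internal cubes, and so $\phi(\square)$ the corresponding parallelepiped,

$$\left| \int_{\phi(\square)} \omega - g(p)\, \mathrm{vol}(\phi(\square)) \right| < \varepsilon\, \mathrm{vol}(\phi(\square)), \qquad (***)$$

where $p$ is any point in $\phi(\square)$. But

$\mathrm{vol}(\phi(\square)) = \det A\; \mathrm{vol}(\square)$ by (i).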
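So

$$\left| \int_{\phi(\square)} \omega - g(p) \det A\; \mathrm{vol}(\square) \right| < \varepsilon \det A\; \mathrm{vol}(\square). \qquad (****)$$

Let $U'$ denote the union of all the internal cubes of the mesh. Then summing the
inequalities (****) over all the internal cubes implies that

$$\left| \int_{\phi(U')} \omega - \sum g(Ar) \det A\; \mathrm{vol}(\square) \right| < \varepsilon \det A\; \mathrm{vol}(U'),$$

where $r$ is any point in the cube $\square$, and the sum extends over all internal cubes. As
the mesh gets finer and finer, the cubes along the boundary can be ignored and

$$\int_{\phi(U')} \omega \to \int_{\phi(U)} \omega$$

while

$$\sum g(Ar) \det A\; \mathrm{vol}(\square) \to \int_U \phi^*\omega.$$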


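(iv) Now let us see what has to be modified in the preceding argument when $\phi$ is no
longer assumed to be linear. First of all, $\phi^*\omega = h \det(\partial\phi/\partial x)\, dx^1 \wedge \cdots \wedge dx^k$, where
now $h(x) = g(\phi(x))$. So the constant $\det A$ is replaced by a function, $\det(\partial\phi/\partial x)$,
where $(\partial\phi/\partial x)$ denotes the matrix-valued function

$$\left(\frac{\partial\phi}{\partial x}\right) = \begin{pmatrix} \partial\phi^1/\partial x^1 & \cdots & \partial\phi^1/\partial x^k \\ \vdots & & \vdots \\ \partial\phi^k/\partial x^1 & \cdots & \partial\phi^k/\partial x^k \end{pmatrix}.$$

The Riemann approximating sum now becomes

$$\sum g(\phi(r)) \det(\partial\phi/\partial x)(r)\, \mathrm{vol}(\square).$$
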
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The inequality (***) will still hold if the mesh is fine enough, but </>(□) is no longer a 
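parallelepiped, and we will not have the exact equality

$\mathrm{vol}(\phi(\square)) = \det(\partial\phi/\partial x)(r)\, \mathrm{vol}(\square)$.

But it would be enough to know that this holds approximately in the sense that

$|\mathrm{vol}(\phi(\square)) - \det(\partial\phi/\partial x)(r)\, \mathrm{vol}(\square)| < \varepsilon\, \mathrm{vol}(\square)$.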

This inequality is a consequence of the mean value theorem. The details of the proof 
are not too complicated and can be found in Loomis-Sternberg Advanced Calculus 
Chapter 8, pp. 343-4. The reader is referred once again to that whole chapter for a 
comprehensive treatment of the theory of integration in $\mathbb{R}^k$. This completes our
discussion of the change of variables formula. 
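
The formula can be checked numerically in a simple non-linear case. The sketch below is only an illustration under stated assumptions: it takes $\phi$ to be the polar coordinate map $(r, \theta) \mapsto (r\cos\theta, r\sin\theta)$ on $U = [1, 2] \times [0, \pi/2]$ (where it is one-to-one, orientation preserving, and has Jacobian determinant $r$), takes $\omega = (x^2 + y^2)\, dx \wedge dy$, and approximates both sides by midpoint Riemann sums over a mesh of small cubes. The region, the form and the mesh sizes are choices made here, not taken from the text.

```python
import math


def midpoint_sum(f, a, b, c, d, n):
    """Midpoint Riemann sum of f(u, v) over [a, b] x [c, d] on an n-by-n mesh."""
    hu, hv = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * hu
        for j in range(n):
            v = c + (j + 0.5) * hv
            total += f(u, v)
    return total * hu * hv


def g(x, y):
    """Coefficient of omega = g dx ^ dy."""
    return x * x + y * y


def phi(r, t):
    """Polar coordinate map from the (r, theta) plane to the (x, y) plane."""
    return r * math.cos(t), r * math.sin(t)


def jacobian_det(r, t):
    """det(d phi / d(r, theta)) for the polar coordinate map."""
    return r


def in_image(x, y):
    """Membership test for phi(U): the quarter annulus 1 <= |(x, y)| <= 2
    (the mesh below only covers the first quadrant)."""
    return 1.0 <= x * x + y * y <= 4.0


# integral of phi^* omega over U
rhs = midpoint_sum(lambda r, t: g(*phi(r, t)) * jacobian_det(r, t),
                   1.0, 2.0, 0.0, math.pi / 2, 200)

# integral of omega over phi(U), using a mesh of squares covering [0, 2] x [0, 2]
lhs = midpoint_sum(lambda x, y: g(x, y) if in_image(x, y) else 0.0,
                   0.0, 2.0, 0.0, 2.0, 400)

exact = 15 * math.pi / 8   # integral of r^3 dr dtheta over U
```

The two sums agree up to the error contributed by the squares that straddle the curved boundary, which is the same boundary effect that the proof disposes of by refining the mesh.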


Integration of k-forms over k-chains in a complex. Now suppose that $\omega$ is a $k$-form
on $\mathbb{R}^n$. Let $\phi: \mathbb{R}^k \to \mathbb{R}^n$ be a smooth map. Let $U$ be a region in $\mathbb{R}^k$. Then we can
consider the integral

$$\int_U \phi^*\omega.$$

Let $K$ be a complex and let $\phi$ be a smooth map of $K$ into $\mathbb{R}^n$. By this we mean
a map of each cell of the complex into $\mathbb{R}^n$ with the property that

the restriction of $\phi$ to each cell $C$ of the complex is a differentiable map.

This makes sense because each cell $C$ is a convex polyhedron in some Euclidean
space. In fact, we require that the map $\phi$ satisfies a technically slightly stronger
condition: For each $k$-cell $C$ of the complex, there is a neighborhood $U$ of $C$ in $\mathbb{R}^k$ and
a differentiable map $\psi: U \to \mathbb{R}^n$ such that $\phi$ and $\psi$ restrict to the same map on $C$. Each
$k$-cell, $C$, of the complex is oriented. We can therefore consider the integral

$$\int_C \phi^*\omega$$

over the cell $C$. Let $c$ be a $k$-chain. So $c$ is a linear combination of $k$-cells:

$$c = \sum_j r_j C_j$$

where the $C_j$ are $k$-cells and the $r_j$ are real numbers. Then we define the integral of
$\phi^*\omega$ over the chain $c$ by the formula

$$\int_c \phi^*\omega = \sum_j r_j \int_{C_j} \phi^*\omega.$$

It is clear that this expression depends linearly on the chain $c$. Thus

the $k$-form $\omega$ together with the map $\phi$ defines a $k$-cochain on the complex $K$.

For example, suppose that our complex $K$ consists of the faces, edges and vertices
of a tetrahedron. For simplicity let us assume that the faces are oriented
‘consistently’. By this we mean that if we take

$c = C_1 + C_2 + C_3 + C_4$ then $\partial c = 0$.





To fix the ideas, let us imagine that the tetrahedron is drawn with its center at the
origin in $\mathbb{R}^3$.

Independence of cellular subdivision. Let us now consider projection from the center
of the tetrahedron onto the surface of the unit sphere, $S$, and call this map $\phi$.

It is easy to see that $\phi$ is a smooth map of the tetrahedron into $\mathbb{R}^3$. Each face of the
tetrahedron is mapped onto a portion of the sphere and we can think of $\phi$ as
mapping the chain $c$ onto the surface of the sphere (with a definite choice of
orientation). If $\omega$ is a two-form in $\mathbb{R}^3$ then we can think of $\int_c \phi^*\omega$ as the integral of $\omega$
over the sphere (with a definite choice of orientation) and write this as

$$\int_S \omega.$$


This notation implicitly assumes a number of justifications all of which are correct. 
First of all, it assumes that the boundary curves of the triangular regions into which 
we have divided the sphere are irrelevant. This is because in computing a two- 
dimensional integral, the definition via Riemann sums implies that (smooth) curves 
make no contribution. A more important assumption implicit in the notation is that
the subdivision into triangular regions (indeed the complex $K$ and the map $\phi$) are
irrelevant. All that matters is the surface of the sphere (covered once) with a definite
orientation. This is a consequence of the change of variables theorem. Let us explain.


Suppose we consider a different complex, say the complex $L$ consisting of all the
faces, edges, and vertices of a cube which we draw centered at the origin, and
consider the map $\mu$ of $L$ onto the surface of the sphere given by projecting from the
center. We also assume that the faces of $L$ are consistently oriented so that

$e = F_1 + F_2 + F_3 + F_4 + F_5 + F_6$ satisfies $\partial e = 0$.








Figure 15.9 












Stokes’ theorem 
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Each triangular region on the sphere is a union of several such subregions and so is 
each ‘square region’ coming from the cube. Let $W$ be one such subregion. Then we
can write

$W = \phi(U)$ where $U$ is a subregion of one of the faces of $K$

and

$W = \mu(V)$ where $V$ is a subregion of one of the faces of $L$.

Consider the map $\psi: U \to V$ given by $\psi = \mu^{-1} \circ \phi$. Then

$\phi = \mu \circ \psi$

so

$\phi^*\omega = \psi^*(\mu^*\omega)$.

(It is easy to check that $\psi$ is a differentiable map.) Hence, by the change of
variables formula,

$$\int_V \mu^*\omega = \pm \int_U \phi^*\omega$$

with the plus sign if and only if $\psi$ is orientation preserving. It is easy to see that the
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and only if the corresponding maps for all the others will also be orientation 
preserving. If this happens we say that $c$ and $e$ induce the same orientation on the sphere.

We have sketched the proof of the justification of our notation for the case of a 
sphere. Of course it works in far greater generality. For any oriented surface in three- 
space (or more generally for any oriented submanifold $M$ in $\mathbb{R}^n$) the integral $\int_M \omega$
makes sense and does not depend on how we cut M up into pieces so as to write it as 
the union of images of cells. 

(We have defined the integral of a two-form over the oriented two-sphere by
cutting the sphere up into pieces, and have sketched why the result does not depend
on how we cut up the sphere. There is an alternative (and equivalent) definition which involves
‘c 
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numbers or by changing the sign of one vector. Correct and incorrect sets for 

$k$ = 1, 2, and 3 are illustrated in figure 15.10.

The boundary of an oriented $k$-cell consists of a sum of oriented $(k - 1)$-cells, and
we gave a general prescription for determining the sign of each cell in the boundary:

At a point on the boundary of $C$, construct a set of vectors in which the first
vector points out of the cell $C$ and the remaining $k - 1$ vectors are a correctly
oriented set for the boundary cell. If the resulting set of $k$ vectors is a correctly
oriented set for the cell $C$ then the boundary cell appears with a plus sign in $\partial C$;
otherwise it appears with a minus sign.


k= 1 


k = 2 


k = 3 









Figure 15.12 


Figure 15.11 illustrates the application of this rule in the case $k = 1$. The vector
$v_1$ points out, and there are no other vectors. At $B$, the vector $v_1$ has the correct
orientation for the branch $\alpha$, but at $A$ it has the wrong orientation. Hence
$\partial\alpha = B - A$.

The case $k = 2$ is illustrated in figure 15.12 for a triangular cell $C$ with
counterclockwise orientation.

For each branch, the vector $v_2$ is chosen to lie tangent to the branch, its direction
given by the orientation of the branch, while the vector $v_1$ points out of
the cell $C$. For branches $\alpha$ and $\beta$, the resulting pair of vectors is correctly ordered
(according to the counterclockwise orientation of $C$) but for branch $\gamma$ it is incorrectly
ordered. Hence $\partial C = \alpha + \beta - \gamma$.

We return to the case where $C = I^k$ is a cube. Let us consider the various terms
obtained by substituting (15.13) into $\int_{\partial C} \phi^*\omega$. We consider the various faces of $\partial C$.
We begin with the two faces for which $u^1$ = constant. On the face $B_1$ where $u^1 = 1$,
the vector $v_1$ points out and the remaining vectors $v_2, \ldots, v_k$ are correctly oriented.
On the face $A_1$ where $u^1 = 0$, the vector $v_1$ points in, and so $v_2, \ldots, v_k$ are incorrectly
oriented. This situation is illustrated for the cases $k = 2$ and $k = 3$ in figure 15.13.

When we come to evaluate $\phi^*\omega$ on these two faces with $u^1$ = constant, all the
terms in $\tau$ which include a factor $du^1$ give zero, and all that remains is
$a_1(u^1, u^2, \ldots, u^k)\, du^2 \wedge \cdots \wedge du^k$. Evaluation of this term on face $B_1$, where $v_2, \ldots, v_k$
are correctly ordered, gives $\int_{I^{k-1}} a_1(1, u^2, \ldots, u^k)\, du^2 \cdots du^k$. Evaluation on face $A_1$,
where the vectors are incorrectly ordered, gives
$-\int_{I^{k-1}} a_1(0, u^2, \ldots, u^k)\, du^2 \cdots du^k$. The two terms together yield the integral

$$\int_{I^{k-1}} [a_1(1, u^2, \ldots, u^k) - a_1(0, u^2, \ldots, u^k)]\, du^2 \cdots du^k.$$
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the fundamental theorem of calculus 


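$$a_1(1, u^2, \ldots, u^k) - a_1(0, u^2, \ldots, u^k) = \int_0^1 \frac{\partial a_1}{\partial u^1}\, du^1$$

and so the combined contribution of the two faces with $u^1$ = constant is

$$\int_{I^k} \frac{\partial a_1}{\partial u^1}\, du^1 \cdots du^k.$$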

In carrying through the same procedure for the faces where $u^2$ = constant, we
note that the set of vectors $v_2, v_1, v_3, \ldots, v_k$ is incorrectly ordered. On the face where
$u^2 = 1$, $v_2$ points out, and so the remaining vectors $v_1, v_3, \ldots, v_k$ are incorrectly ordered.
On the face where $u^2 = 0$, $v_2$ points in, and $v_1, v_3, \ldots, v_k$ are correctly ordered. In
the evaluation of $\tau$, only the term $a_2(u^1, u^2, \ldots, u^k)\, du^1 \wedge du^3 \wedge \cdots \wedge du^k$ contributes,
and we obtain

$$-\int_{I^k} \frac{\partial a_2}{\partial u^2}\, du^1 \cdots du^k.$$
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A similar argument shows that the faces where u j is constant contribute 
$\pm \int_{I^k} (\partial a_j/\partial u^j)\, du^1 \cdots du^k$, with a + sign for odd $j$, a - sign for even $j$. On summing the
contributions from all the $2k$ faces in $\partial I^k$, we obtain (15.12).
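
The face-by-face bookkeeping can be checked numerically for $k = 2$. The sketch below assumes, in line with the sign rule just stated, that $\tau = a_1\, du^2 + a_2\, du^1$ on the unit square, so that the faces $u^1$ = constant contribute $+\int \partial a_1/\partial u^1$ and the faces $u^2$ = constant contribute $-\int \partial a_2/\partial u^2$. The particular coefficient functions and the helper names are choices made for illustration only.

```python
def midpoint_1d(f, n=400):
    """Midpoint Riemann sum of f over the unit interval."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h


def midpoint_2d(f, n=200):
    """Midpoint Riemann sum of f over the unit square."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h


# illustrative coefficients of tau = a1 du^2 + a2 du^1
def a1(u1, u2):
    return u1 * u1 * u2


def a2(u1, u2):
    return u1 + u2 ** 3


def da1_du1(u1, u2):
    return 2 * u1 * u2


def da2_du2(u1, u2):
    return 3 * u2 * u2


# faces u^1 = 1 and u^1 = 0 (plus sign, j = 1 is odd)
faces_u1 = midpoint_1d(lambda u2: a1(1.0, u2) - a1(0.0, u2))
# faces u^2 = 1 and u^2 = 0 (minus sign, j = 2 is even)
faces_u2 = -midpoint_1d(lambda u1: a2(u1, 1.0) - a2(u1, 0.0))

boundary = faces_u1 + faces_u2
interior = midpoint_2d(lambda u1, u2: da1_du1(u1, u2) - da2_du2(u1, u2))
```

For these coefficients both `boundary` and `interior` come out close to $-1/2$, the value obtained by integrating by hand.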

This completes our proof of Stokes’ theorem for a cube. We must now consider 
more general cells. In the plane, as we pointed out in Chapter 8, every polygon can be 
decomposed into triangles. Indeed, we can decompose any polygon into convex 
polygons and any convex polygon can be decomposed into triangles by choosing a 
point in the interior and joining it to all the vertices. 

Similarly, in three dimensions, every polyhedron can be decomposed into tetra- 
hedra. Indeed, we may, after a preliminary decomposition, assume that C is convex. 
We may also assume that the faces of C have been decomposed into triangles. Again, 
just pick a point p in the interior. C will be decomposed into tetrahedra whose bases 
are the facial triangles and with apex p. Thus for two-forms, it is sufficient to prove 
Stokes’ theorem for tetrahedra. 

By induction we can do the same in $k$ dimensions: Let $S \subset \mathbb{R}^k$ be the set defined by
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A k-simplex is the subset of U k which is the image of S under an invertible affine map. 
Thus a 1-simplex is an interval, a 2-simplex is a triangle and a 3-simplex is a 
tetrahedron. 

Now suppose that we have $C$ decomposed into simplices $\{\Delta_i\}$. We claim that if we
know Stokes’ theorem for all simplices then we know it for $C$. Indeed, consider the
complex whose cells are all the simplices going into the decomposition of $C$, together
with all their lower dimensional faces. We can then identify $C$ with a chain in this
complex, i.e. we can write

$C = \sum \Delta_i$

and

$\partial C = \sum \partial\Delta_i$.

Therefore

$$\int_{\partial C} \omega = \sum \int_{\partial\Delta_i} \omega$$

$$= \sum \int_{\Delta_i} d\omega \quad \text{since we know Stokes’ theorem for the } \Delta_i$$

$$= \int_C d\omega.$$

By the preceding argument, it is enough to prove Stokes’ theorem for the case
of a simplex.

We will make use of the invariance of the integral under pullback to reduce the 
proof for a simplex to that for a cube. Here is the idea for the case of a triangle. 

To prove Stokes’ theorem for a triangle, it is enough to prove it for an equilateral
triangle $\Delta$ since we can find an affine transformation which carries any triangle
into an equilateral one. We may assume that the edge length of $\Delta$ is 1. Suppose
the linear differential form $\tau$ were identically zero outside a disk whose center is
at one of the vertices and which does not touch the opposite side.

Then $d\tau$ is also identically zero outside this disk. Now as far as Stokes’ theorem is
concerned, the only contributions to the integral come from within the circle, and
so we may replace the triangle by the parallelogram which has the same vertex and
the same two sides through that vertex.
That is, if $\square$ denotes the parallelogram and $\Delta$ the triangle, then because $\tau$ and $d\tau$
vanish outside the circle

$$\int_{\partial\square} \tau = \int_{\partial\Delta} \tau \quad \text{and} \quad \int_\square d\tau = \int_\Delta d\tau.$$

But, up to a change of variables, the parallelogram is just a square. Hence
$\int_{\partial\square} \tau = \int_\square d\tau$ and therefore $\int_{\partial\Delta} \tau = \int_\Delta d\tau$.

We will have proved Stokes’ theorem for $\Delta$ if we can establish the following:
Every smooth linear differential form (defined in a neighborhood of $\Delta$) can be
written as a sum of three terms

$\tau = \tau_1 + \tau_2 + \tau_3$

where each $\tau_i$ is identically zero outside the disk $D_i$, where $D_i$ is the disk centered at
the $i$th vertex and of radius $R$, with $R$ chosen large enough that the three disks cover
the triangle but small enough that no disk touches the side opposite its center.

Lemma. The function

$$f(u) = \begin{cases} e^{-1/u} & u > 0 \\ 0 & u \le 0 \end{cases}$$

is infinitely differentiable at all points.


Proof. For $u \ne 0$ it is clear that $f$ has derivatives of all orders and that $f^{(k)}(u) = 0$
for $u < 0$. So we must prove that $f^{(k)}(u)/u \to 0$ as $u \to 0$ for any $k$. But
$f^{(k)}(u) = P_k(1/u)e^{-1/u}$ for $u > 0$, where $P_k$ is some polynomial, and hence

$$\lim_{u \to 0^+} (1/u) f^{(k)}(u) = \lim_{s \to \infty} sP_k(s)e^{-s} = 0$$

since $e^s$ goes to infinity faster than any polynomial.

Figure 15.16


Notice that the function $f$ is $= 0$ for $u \le 0$ and is strictly positive for $u > 0$.
Let $r_i$ denote the distance from the $i$th vertex and define the function $g_i$ by

$$g_i(x) = \begin{cases} f(R - r_i(x)) & r_i(x) \le R \\ 0 & r_i(x) > R. \end{cases}$$

Then $g_i$ is infinitely differentiable and

$$g_i(x) \begin{cases} > 0 & r_i(x) < R \\ = 0 & r_i(x) \ge R. \end{cases}$$

Let

$g = g_1 + g_2 + g_3$.

Since each point in $\Delta$ is interior to at least one of the disks $D_i$ (that is, since
$r_i(x) < R$ for at least one of $i = 1, 2, 3$) we know that

$g(x) > 0$ at every point $x$ of $\Delta$.








break up into pieces and reduce to the cube case. 
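
The construction of the functions $g_i$ can be tried out numerically. The sketch below is an illustration under explicit assumptions: the equilateral triangle has vertices $(0,0)$, $(1,0)$, $(1/2, \sqrt{3}/2)$; the radius is taken to be $R = 0.7$, a hypothetical value lying between $1/\sqrt{3}$ (so the disks cover the triangle) and $\sqrt{3}/2$ (so no disk reaches the opposite side); and the $g_i$ are normalized to $g_i/g$, a step assumed here so that the resulting functions add up to 1 on the triangle.

```python
import math

R = 0.7  # assumed radius: 1/sqrt(3) < R < sqrt(3)/2
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]


def f(u):
    """The function of the lemma: exp(-1/u) for u > 0 and 0 otherwise."""
    return math.exp(-1.0 / u) if u > 0 else 0.0


def g_i(i, x, y):
    """Bump function that is positive inside the disk D_i and zero outside."""
    vx, vy = VERTICES[i]
    return f(R - math.hypot(x - vx, y - vy))


def weights(x, y):
    """The three normalized functions g_i / g at the point (x, y)."""
    gs = [g_i(i, x, y) for i in range(3)]
    total = sum(gs)          # positive at every point of the triangle
    return [gi / total for gi in gs]


def triangle_points(n):
    """Sample points of the triangle on a barycentric grid."""
    pts = []
    for i in range(n + 1):
        for j in range(n + 1 - i):
            a, b = i / n, j / n
            c = 1.0 - a - b
            x = a * VERTICES[0][0] + b * VERTICES[1][0] + c * VERTICES[2][0]
            y = a * VERTICES[0][1] + b * VERTICES[1][1] + c * VERTICES[2][1]
            pts.append((x, y))
    return pts


samples = [(p, weights(*p)) for p in triangle_points(20)]
```

Multiplying a form $\tau$ by the three weights would then give three pieces, each vanishing outside one disk, whose sum is $\tau$ on the triangle.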

Example 

As an explicit example of Stokes’ theorem and the evaluation of forms on cells,
we consider the two-form $\tau = (x^2 + y^2 + z^2)\, dx \wedge dy$ on the cell $C$ consisting of the
solid hemisphere of radius $R$ shown in figure 15.17, with a right-handed orientation.
We will evaluate $\int_{\partial C} \tau$ and $\int_C d\tau$ explicitly.

The solid hemisphere is bounded by two cells: the hemispherical surface $A$ and
the disk $B$ in the equatorial plane. If we assign both of these cells a counterclockwise
orientation as seen from outside, then $\partial C = A + B$.

For the hemispherical surface, we use the spherical coordinates $\theta$ and $\phi$ as
parameters. One could use parameters $u = 2\theta/\pi$ and $v = \phi/2\pi$ so that the region of
integration is the unit square, but in practice a rectangle is just as convenient.









north pole 



meridian 


Figure 15.18 


The pullback of $x$, $y$ and $z$ under the parameterization $\alpha$ is $\alpha^*x = R\sin\theta\cos\phi$,
$\alpha^*y = R\sin\theta\sin\phi$, $\alpha^*z = R\cos\theta$. Thus

$\alpha^*dx = R\cos\theta\cos\phi\, d\theta - R\sin\theta\sin\phi\, d\phi$,
$\alpha^*dy = R\cos\theta\sin\phi\, d\theta + R\sin\theta\cos\phi\, d\phi$,

so that

$\alpha^*\tau = \alpha^*(x^2 + y^2 + z^2)(\alpha^*dx) \wedge (\alpha^*dy)$

$= R^2(R^2\sin\theta\cos\theta\cos^2\phi\, d\theta \wedge d\phi - R^2\sin\theta\cos\theta\sin^2\phi\, d\phi \wedge d\theta)$

$= R^4\sin\theta\cos\theta\, d\theta \wedge d\phi$.

To check that $\alpha$ preserves orientation, we look at the images of the ordered set
of vectors $e_1$ and $e_2$. The images $v_1$ and $v_2$ agree with the orientation of the
hemisphere $A$, as shown in figure 15.18.

Now, to evaluate $\tau$ on the hemisphere $A$, we just calculate the double integral
$\int \alpha^*\tau = \int_{\phi=0}^{2\pi} \int_{\theta=0}^{\pi/2} R^4\sin\theta\cos\theta\, d\theta\, d\phi = \pi R^4$.

For the disk $B$, a convenient choice of parameters consists of $r = \sqrt{x^2 + y^2}$ and
the angle $\phi$. In order to obtain the disk $B$ with the correct orientation we choose
the ordering $\phi, r$, so that the parameter space is as shown in figure 15.19. Then
the images of $e_1$ and $e_2$ have the correct ordering for the orientation of $B$
(counterclockwise as seen from below, clockwise as seen from above).

This parameterization $\beta$ is specified by the pullback $\beta^*x = r\cos\phi$, $\beta^*y = r\sin\phi$,
$\beta^*z = 0$, so that $\beta^*\tau = r^2(\cos\phi\, dr - r\sin\phi\, d\phi) \wedge (\sin\phi\, dr + r\cos\phi\, d\phi)$ or $\beta^*\tau =
r^3\, dr \wedge d\phi = -r^3\, d\phi \wedge dr$. Thus the value of $\tau$ on the oriented disk $B$ is

$$\int_B \tau = \int_{r=0}^{R} \int_{\phi=0}^{2\pi} -r^3\, d\phi\, dr = -\tfrac{1}{2}\pi R^4.$$


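Combining the results for the two cells in the boundary, we find

$$\int_{\partial C} \tau = \int_A \tau + \int_B \tau = \pi R^4 - \tfrac{1}{2}\pi R^4 = \tfrac{1}{2}\pi R^4.$$

Now we evaluate $\int_C d\tau$. Since $\tau = (x^2 + y^2 + z^2)\, dx \wedge dy$, $d\tau = 2z\, dz \wedge dx \wedge dy =
2z\, dx \wedge dy \wedge dz$. We parameterize the cell $C$ by using spherical coordinates, $r, \theta, \phi$.
The pullback under this parameterization $\psi$ is $\psi^*x = r\sin\theta\cos\phi$, $\psi^*y =
r\sin\theta\sin\phi$, $\psi^*z = r\cos\theta$ and, after some calculation, we find $\psi^*(d\tau) =
2r^3\sin\theta\cos\theta\, dr \wedge d\theta \wedge d\phi$. To determine the value of $d\tau$ on $C$ we integrate over a
rectangular solid in parameter space:

$$\int_C d\tau = \int_{r=0}^{R} \int_{\theta=0}^{\pi/2} \int_{\phi=0}^{2\pi} 2r^3\sin\theta\cos\theta\, d\phi\, d\theta\, dr = \tfrac{1}{2}\pi R^4,$$

in agreement with the value of $\int_{\partial C} \tau$.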
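
The three integrals in this example can be confirmed numerically. The sketch below assumes the pulled-back integrands derived above ($R^4\sin\theta\cos\theta$ on $A$, $-r^3$ on $B$, and $2r^3\sin\theta\cos\theta$ on $C$), uses the fact that none of them depends on $\phi$, and takes the illustrative value $R = 2$; the helper `midpoint` is not from the text.

```python
import math


def midpoint(f, a, b, n=2000):
    """Midpoint Riemann sum of f over the interval [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


R = 2.0  # illustrative radius

# None of the integrands depends on phi, so the phi-integration gives 2*pi.
int_A = 2 * math.pi * midpoint(
    lambda th: R ** 4 * math.sin(th) * math.cos(th), 0.0, math.pi / 2)
int_B = 2 * math.pi * midpoint(lambda r: -r ** 3, 0.0, R)
boundary = int_A + int_B                      # integral of tau over dC = A + B

interior = (2 * math.pi
            * midpoint(lambda r: 2 * r ** 3, 0.0, R)
            * midpoint(lambda th: math.sin(th) * math.cos(th), 0.0, math.pi / 2))

expected = 0.5 * math.pi * R ** 4             # the value found by hand
```

Both `boundary` and `interior` agree with `expected` to within the accuracy of the midpoint rule.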
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Let us now briefly indicate how we can use the proof that we gave of Stokes’ theorem 
to suggest an alternative definition of integration of a $k$-form over a $k$-dimensional
submanifold. This alternative definition will coincide with the definition of
integration over a $k$-chain that we gave above in the case that the submanifold is
given as the image of a $k$-chain. This will then sketch out in more detail how to prove
the fact that the integral over a $k$-chain is independent of the cellular decomposition,
as we illustrated at the end of the preceding section for the case of a sphere. The
material presented from here to the end of this section can be omitted on first
reading.

Patching 

In our proof of Stokes’ theorem we made use of a collection of functions
$\{\phi_1, \ldots, \phi_n\}$ with the property that each of the $\phi_i$ was continuously differentiable,
non-negative, and

$\phi_1 + \cdots + \phi_n = 1$ in the region of interest (the simplex).
Furthermore each $\phi_i$ vanishes outside a ball $D_i$ where we choose the ball $D_i$ to have a





either $f_i$ or $f_j$. So $f_i(D_i \cap D_j)$ is an open subset of $U_i$ and $f_j(D_i \cap D_j)$ is an open subset
of $U_j$ and

$f_i \circ f_j^{-1}$ maps $f_j(D_i \cap D_j)$ onto $f_i(D_i \cap D_j)$

and carries

$W_j \cap f_j(D_i \cap D_j)$ onto $W_i \cap f_i(D_i \cap D_j)$.

So our situation is as follows: We have covered $M$ with sets $O_i = D_i \cap M$. Let $g_i$
denote the restriction of $f_i$ to $O_i$. Then $g_i$ is a one-to-one map of $O_i$ onto the open
subset $W_i$ of $\mathbb{R}^k$. Furthermore, the map

$g_{ij} = g_i \circ g_j^{-1}$ of $W_j \cap g_j(O_i \cap O_j)$ onto $W_i \cap g_i(O_i \cap O_j)$

is differentiable with differentiable inverse given by $g_{ji} = g_j \circ g_i^{-1}$.

We can think of the manifold $M$ as being covered with ‘patches’: The maps $g_i^{-1}$ tell
how the $W_i$ cover the manifold, and the maps $g_{ij}$ tell how the $W_i$ and $W_j$ patch
together. Now each map $g_{ij}$ can be either orientation preserving or reversing.
Suppose that we are in a situation where all the $g_{ij}$ are orientation preserving. (This
then determines an ‘orientation’ on $M$. It is intuitively clear that if $M$ is connected
there are then only two orientations.) Now let $\phi_1, \ldots, \phi_r$ be a collection of functions
as above so that

$\phi_i = 0$ outside of $D_i$ and $\phi_1 + \cdots + \phi_r = 1$ on $M$.

Now for any $k$-form $\omega$ we can write

$\omega = \phi_1\omega + \cdots + \phi_r\omega$.

The $i$th summand on the right vanishes outside $D_i$. We would then define
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In other words, instead of cutting M up into pieces, we write any form as a sum of 
small pieces each of which lives only on one coordinate patch on $M$ which we can
then integrate by pulling it back to a subset of $\mathbb{R}^k$. A repeated use of the change of
variables formula easily shows that this definition of integral does not depend on the
choice of the $\phi_i$ or of the patching (i.e. of the choice of $O_i$ and $g_i$) but only on the
orientation of $M$. For further details on this definition we refer once again
to Advanced Calculus by Loomis and Sternberg, Chapters 9-12.



15.5. Differential forms and cohomology* 

The $k$-forms which we have defined will function as incipient $k$-cochains for any
complex situated in $\mathbb{R}^n$, and the differential operator $d$ is consistent with the
coboundary operator $d$. Corresponding to the subspaces which were defined in
terms of the image and kernel of the coboundary operator, we can now construct
subspaces of the (infinite-dimensional) spaces $\Omega^{(k)}(\mathbb{R}^n)$ by using the differential
operator $d$.

If a $k$-form $\omega$ satisfies $d\omega = 0$, it is called closed. Such a $k$-form becomes, by
integration over cells, a $k$-cochain $W$ which satisfies $dW = 0$ and therefore belongs
to the space $Z^k$ of $k$-cocycles. In other words, a closed $k$-form is an incipient
$k$-cocycle.

If a $k$-form $\omega$ can be expressed as $\omega = d\tau$, it is called exact. In this case, by
integration over the cells of a complex, $\tau$ gives rise to a $(k - 1)$-cochain $T$, $\omega$ to a
$k$-cochain $W$, such that $W = dT$. The cochain $W$ therefore lies in the subspace $B^k$
of $k$-coboundaries. In other words, any exact $k$-form is an incipient coboundary.

* Can be omitted on first reading.

For any complex, we know that $B^k$ is a subspace of $Z^k$. The corresponding
statement about $k$-forms is that any exact $k$-form (incipient element of $B^k$) is also
closed (incipient element of $Z^k$). The proof is simple: if $\omega$ is exact, $\omega = d\tau$ and so
$d\omega = d(d\tau) = 0$ and $\omega$ is closed.

For any complex, the quotient spaces (cohomology spaces) $H^k = Z^k/B^k$ depend
just upon the underlying space and not upon how it is cut up to form a complex.
We might reasonably expect, therefore, to obtain the spaces $H^k$ by considering
differential forms: we take the quotient of the infinite-dimensional space of closed
forms ($d\omega = 0$) by the space of exact forms ($\omega = d\tau$). The resulting spaces define
the so-called de Rham cohomology of the underlying space on which the differential
forms are defined.

As a clue in constructing a basis for the spaces $H^k$, we recall that, for any
complex, the space $H^k$ is dual to the homology space $H_k$. It is therefore reasonable
to expect that a basis element for $H^k$ will be determined by its values on the
$k$-chains which form a basis for $H_k$ in any convenient complex in the space.

Let us consider some extremely simple examples which illustrate these rather 
abstract considerations. 

Example 1A. The underlying space is a single line segment. On the space we
construct zero-forms, which are differentiable functions $f(t)$, and one-forms, which
are all of the form $\omega = g(t)\, dt$.

In this case, the closed zero-forms are the constant functions, which define a
one-dimensional subspace. There can be no exact zero-forms, so the quotient space
$H^0$ of closed zero-forms by exact zero-forms is one-dimensional. This reflects the
fact that no matter how we cut up the line segment to obtain a complex, the
complex is always a connected one, with $\dim H_0 = 1$.

On considering one-forms, we discover that every one-form is exact: given $\omega =
g(t)\, dt$ we form an antiderivative $G(t) = \int g(t)\, dt$, so that $\omega = dG$. Hence $H^1$ is
zero-dimensional in this case.

Example 1B. The underlying space now consists of two disjoint line segments. In
this case the space $H^0 = Z^0$ is two-dimensional: it consists of functions which
have one constant value on the interval $[a, b]$ and another, possibly different,
constant value on the interval $[c, d]$. This reflects the fact that a complex constructed
on this space will have two connected components.


Figure 15.21

Figure 15.22

Figure 15.23


More generally, whatever the dimension of the underlying space, the closed
zero-forms will be functions which are constant on each connected component,
and these will form a space $H^0 = Z^0$ whose dimension equals the number of
connected components in the underlying space. From now on, we shall consider
only connected spaces, and we need say nothing more about $H^0$.

Example 2A. The underlying space is a rectangle in the plane. Because the space
is connected, $H^0$ is one-dimensional. On considering one-forms, we note that any
closed one-form $\omega$ is also exact, a result which we proved when considering line
integrals. The space $H^1$ is therefore zero-dimensional.

Looking now at two-forms, we note that every two-form is exact. The most
general two-form can be expressed as $\tau = f(x, y)\, dx \wedge dy$. We form a function

$$F(x, y) = \int_0^x f(t, y)\, dt$$

so that

$$\frac{\partial F}{\partial x} = f$$

and then

$d(F\, dy) = f(x, y)\, dx \wedge dy = \tau$.

We conclude that $H^2$ is zero-dimensional.

Example 2B. The underlying space is again a rectangle in the plane, but with one
point, which we take to be the origin 0, deleted. Again, $H^0$ is one-dimensional
because the space is connected, and $H^2$ is zero-dimensional because every two-form






■ 0 









Figure 15.24 
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is exact. When we turn to one-forms, however, a new phenomenon arises: there 
exist one-forms which are closed but not exact, so that $H^1$ is not zero-dimensional in this case.


We can guess a basis for $H^1$ by first considering the homology space $H_1$ for a
two-complex situated in this space, as shown in figure 15.25. Because the origin is
not part of the space, there can be no cell in the complex which includes the
origin. As a result, a cycle like $\alpha + \beta + \gamma$ that encircles the origin cannot be a
boundary. The equivalence class of such a cycle (modulo the space of boundaries)
forms a basis for the one-dimensional space $H_1$. For the same complex, the
cohomology space $H^1$ is dual to $H_1$. Its basis element will be the equivalence class of
a one-cocycle $W$ which is not a coboundary. Because $H^1$ is dual to $H_1$, we expect
$W$ to have a fixed non-zero value on any cycle such as $\alpha + \beta + \gamma$ which encircles
the origin once in a counterclockwise sense. To find a basis for the de Rham
cohomology space $H^1$, we must discover an incipient cocycle for $W$: a closed
one-form $\omega_0$ with a fixed non-zero value on any curve which encircles the origin
once. We can obtain such a one-form by considering the function $\theta = \arctan(y/x)$,
which is defined everywhere except at the origin and which can be made continuous
except on one line proceeding outward from the origin (usually the negative x-axis,
so that $-\pi < \theta \le \pi$). We then form the differential

$$\omega_0 = d\theta = \frac{x\, dy - y\, dx}{x^2 + y^2}.$$

This one-form is defined and continuous everywhere except at the origin. It is
closed: $d\omega_0 = 0$, as you can verify by direct computation. On the other hand, it
is not exact: there is no continuous function $f(x, y)$ such that $\omega_0 = df$. For any
curve $\alpha$ which encircles the origin $n$ times in a counterclockwise sense,

$$\int_\alpha \omega_0 = 2\pi n.$$


A basis for $H^1$ in this case is therefore the equivalence class

$$\bar\omega_0 = \frac{x\, dy - y\, dx}{x^2 + y^2} + dg$$

where $g(x, y)$ is any differentiable function.
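
The value $2\pi n$ can be observed numerically by integrating $\omega_0$ along explicitly parameterized closed curves. The sketch below is illustrative only: it uses circles traversed a chosen number of times, a midpoint rule in the curve parameter, and helper names (`integrate_omega0`, `circle`) that do not come from the text.

```python
import math


def integrate_omega0(curve, dcurve, t0, t1, n=2000):
    """Midpoint-rule approximation of the line integral of
    omega_0 = (x dy - y dx) / (x^2 + y^2) along t -> curve(t)."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        x, y = curve(t)
        dx, dy = dcurve(t)
        total += (x * dy - y * dx) / (x * x + y * y)
    return total * h


def circle(cx, cy, rho, turns):
    """Circle of radius rho about (cx, cy), traversed `turns` times
    counterclockwise as t runs from 0 to 2*pi; returns the curve and
    its derivative."""
    def curve(t):
        return cx + rho * math.cos(turns * t), cy + rho * math.sin(turns * t)

    def dcurve(t):
        return (-rho * turns * math.sin(turns * t),
                rho * turns * math.cos(turns * t))

    return curve, dcurve


two_pi = 2 * math.pi
three_turns = integrate_omega0(*circle(0.0, 0.0, 2.0, 3), 0.0, two_pi)
off_center = integrate_omega0(*circle(0.5, 0.0, 2.0, 1), 0.0, two_pi)
not_encircling = integrate_omega0(*circle(3.0, 0.0, 1.0, 1), 0.0, two_pi)
```

The first two curves encircle the origin (three times and once), while the third does not, so the three integrals come out close to $6\pi$, $2\pi$ and $0$ respectively.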





Figure 15.26 


Example 3A. The underlying space is a rectangular parallelepiped region in $\mathbb{R}^3$. We
find that

$H^0$ is one-dimensional (the region is connected);
$H^1$ is {0} (every closed one-form is exact);
$H^2$ is {0} (every closed two-form is exact);
$H^3$ is {0} (every three-form is exact).

The results for $H^1$, $H^2$ and $H^3$ follow as consequences of a general theorem, known
as Poincaré’s lemma, which states, loosely speaking, that, on any space that can
be continuously contracted to a point, every closed differential form is also exact.
The interesting cases are those in which the space has ‘holes’ and so cannot be
contracted to a point. Of course, these are also the cases in which there exist cycles
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Example 3B. The underlying space is a rectangular region in IR 3 with the origin 
excluded. The results for and H 3 are unchanged, but now H 2 is one- 

dimensional. The differential form 


a dz + y dz a dx + z dx a dy 
Io_ (x 2 + y 2 + z 2 ) 3 ' 2 

is defined everywhere except at the origin. It is closed, as you can verify by a 
rather tedious computation. However, J s r 0 = 4n for any closed surface S which 
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and so it is not surprising to find it belongs to the equivalence class which is dual 
to H 2 . 
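
The value $4\pi$ can be checked numerically on a sphere that is deliberately not centred at the origin. This is a sketch under assumptions: the function name `flux_tau0`, the particular centres and radius, and the midpoint rule in the angles are illustrative choices, not part of the text. The sphere is oriented by its outward normal, so the integral of $\tau_0$ reduces to the integral of $(p\cdot n)/|p|^3$ against the area element.

```python
import math

def flux_tau0(center, radius, n_theta=200, n_phi=400):
    """Integrate tau_0 over the sphere |p - center| = radius (outward orientation).

    On the sphere, tau_0 restricts to (p . n) / |p|^3 times the area element
    radius^2 sin(theta) dtheta dphi, where n is the outward unit normal.
    """
    cx, cy, cz = center
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        st, ct = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            nx, ny, nz = st * math.cos(phi), st * math.sin(phi), ct
            x, y, z = cx + radius * nx, cy + radius * ny, cz + radius * nz
            r3 = (x * x + y * y + z * z) ** 1.5
            total += (x * nx + y * ny + z * nz) / r3 * radius * radius * st
    return total * d_theta * d_phi

if __name__ == "__main__":
    # Origin inside the sphere: the result should be close to 4*pi.
    print(flux_tau0((0.3, 0.2, -0.4), 1.0), 4.0 * math.pi)
    # Origin outside the sphere: the result should be close to zero.
    print(flux_tau0((3.0, 0.0, 0.0), 1.0))
```

The second call illustrates the other half of the picture: a closed surface that does not enclose the origin bounds a region on which $\tau_0$ is closed, so Stokes’ theorem forces the integral to vanish.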

Example 3C. The underlying space is a rectangular region in $\mathbb{R}^3$ with the $z$-axis
excluded. In this case $H^1$ is one-dimensional, with basis again

$$\bar\omega_0 = \frac{x\,dy - y\,dx}{x^2 + y^2} + dg,$$

which measures how many times a closed curve encircles the $z$-axis. The other basis
element is

$$\bar\omega_1 = \frac{(R - 1)\,dz - z\,dR}{z^2 + (R - 1)^2} + dg, \qquad R = +\sqrt{x^2 + y^2},$$

winds around the unit circle in the $xy$-plane. Thus the basis elements $\omega_0$ and






now prove Poincaré’s lemma by giving an explicit prescription for constructing a
The theorem will be proved for the case where $\omega$ is defined on an open set in $\mathbb{R}^n$
which has the property that each point in the set can be joined to the origin by
a straight line which lies entirely within the set. Such a set is called star-shaped.
Later we can easily extend the proof to the case of a set which can be put into 
one-to-one correspondence with a star-shaped set by a smooth mapping. Examples 
are shown in figure 15.28. 








Figure 15.28 Star-shaped relative to 0 (though not with respect to p). Region on surface of sphere - not star-shaped, but the image of a star-shaped region.



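If the coordinates of $p$ are $(x^1, \ldots, x^n)$, the line segment is specified parametrically by

$$\begin{pmatrix} u^1 \\ \vdots \\ u^n \end{pmatrix} = \begin{pmatrix} tx^1 \\ \vdots \\ tx^n \end{pmatrix}, \qquad 0 \le t \le 1.$$

In terms of pullback, we have $\beta^* u^i = t x^i$.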

Any function $g$ defined on the region $Q$ can be pulled back to the region $Q \times [0,1]$.
For example, the function $g(u^1, \ldots, u^n)$ pulls back to the function $\beta^* g = g(tx^1, \ldots, tx^n)$.
This means, in effect, that $\beta^* g$ is a function of $p = (x^1, \ldots, x^n)$ and $t$ which assigns,
as $t$ ranges from 0 to 1, the values assumed by $g$ along the line joining the origin
to $p$. Of course $\beta^* g$ is also a function of the coordinates of $p$; if $p$ is changed, we
look at values of $g$ on a different line segment. Study figure 15.29 until you visualize
the significance of $\beta^* g$.



By the usual rules, we can pull back any basis one-form,

$$\beta^* du^i = d(tx^i) = x^i\,dt + t\,dx^i,$$

and hence the pullback of any $k$-form. Given a $k$-form $\omega$ defined on $Q$, we can write it in terms of the
coordinates $u^i$:

$$\omega = \sum_{i_1 < \cdots < i_k} a_{i_1 \ldots i_k}(u^1, \ldots, u^n)\,du^{i_1}\wedge\cdots\wedge du^{i_k}.$$

On pulling back such a form we obtain

$$\beta^*\omega = \sum_{i_1 < \cdots < i_k} a_{i_1 \ldots i_k}(tx^1, \ldots, tx^n)\,(t\,dx^{i_1} + x^{i_1}dt)\wedge\cdots\wedge(t\,dx^{i_k} + x^{i_k}dt).$$

This $k$-form $\beta^*\omega$ is a sum of terms of the form $\tau_1(t) = A(t, x^1, \ldots, x^n)\,dx^{i_1}\wedge\cdots\wedge dx^{i_k}$
or $\tau_2(t) = B(t, x^1, \ldots, x^n)\,dt\wedge dx^{i_1}\wedge\cdots\wedge dx^{i_{k-1}}$. We now define a linear operator $L$ by

$$L\tau_1 = 0, \qquad L\tau_2 = \left(\int_0^1 B(t, x^1, \ldots, x^n)\,dt\right) dx^{i_1}\wedge\cdots\wedge dx^{i_{k-1}}.$$

This operator has the remarkable property that

$$dL\tau_1 + L\,d\tau_1 = \tau_1(1) - \tau_1(0),$$
$$dL\tau_2 + L\,d\tau_2 = 0.$$

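This proof is a matter of straightforward computation.
We start with $\tau_1$. Of course $dL\tau_1 = 0$. Also

$$d\tau_1 = \frac{\partial A}{\partial t}\,dt\wedge dx^{i_1}\wedge\cdots\wedge dx^{i_k} + \text{terms with no } dt$$

so that

$$L\,d\tau_1 = \left(\int_0^1 \frac{\partial A}{\partial t}\,dt\right) dx^{i_1}\wedge\cdots\wedge dx^{i_k} = \tau_1(1) - \tau_1(0).$$

For $\tau_2$ we have

$$L\tau_2 = \left(\int_0^1 B(t, x^1, \ldots, x^n)\,dt\right) dx^{i_1}\wedge\cdots\wedge dx^{i_{k-1}},$$

which is not a function of $t$. In forming $dL\tau_2$, we may differentiate under the
integral sign to obtain

$$dL\tau_2 = \sum_{j=1}^n \left(\int_0^1 \frac{\partial B}{\partial x^j}\,dt\right) dx^j\wedge dx^{i_1}\wedge\cdots\wedge dx^{i_{k-1}}.$$

On the other hand,

$$d\tau_2 = \sum_{j=1}^n \frac{\partial B}{\partial x^j}\,dx^j\wedge dt\wedge dx^{i_1}\wedge\cdots\wedge dx^{i_{k-1}}$$

or

$$d\tau_2 = -\sum_{j=1}^n \frac{\partial B}{\partial x^j}\,dt\wedge dx^j\wedge dx^{i_1}\wedge\cdots\wedge dx^{i_{k-1}}$$

so that

$$L\,d\tau_2 = -\sum_{j=1}^n \left(\int_0^1 \frac{\partial B}{\partial x^j}\,dt\right) dx^j\wedge dx^{i_1}\wedge\cdots\wedge dx^{i_{k-1}}$$

and we conclude that $dL\tau_2 + L\,d\tau_2 = 0$.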

It follows that if we consider $dL(\beta^*\omega) + L\,d(\beta^*\omega)$, we can ignore all the terms
of type $\tau_2$, which involve a factor of $dt$. As long as $k > 0$, we have $\tau_1(0) = 0$ because
of the factor of $t$ which accompanies each $dx^i$. Thus, setting $t = 1$, we obtain

$$dL(\beta^*\omega) + L\,d\beta^*\omega = \sum_{i_1 < \cdots < i_k} a_{i_1 \ldots i_k}(x^1, \ldots, x^n)\,dx^{i_1}\wedge\cdots\wedge dx^{i_k} = \omega.$$

But we know that $d\beta^*\omega = \beta^* d\omega$, so we have

$$dL\beta^*\omega + L\beta^* d\omega = \omega.$$

Thus the linear operator $S = L\beta^*$ satisfies the identity

$$dS\omega + S\,d\omega = \omega.$$

If $\omega$ is closed, so that $d\omega = 0$, we have

$$dS\omega = \omega$$

and we have proved that $\omega$ is exact.

So far we have assumed that $\omega$ is defined on a star-shaped region $Q$. We now
extend the above result to the case where $\omega$ is defined on a region $\Psi(Q)$ which is
the image of a star-shaped region under a smooth one-to-one mapping $\Psi$. If $\omega$ is
closed, so is $\Psi^*\omega$, since $d\Psi^*\omega = \Psi^* d\omega = 0$. We therefore can write $\Psi^*\omega = d(S\Psi^*\omega)$.
Now, since $\Psi$ is invertible, we can apply the inverse pullback $(\Psi^*)^{-1}$ to obtain

$$\omega = (\Psi^*)^{-1} d(S\Psi^*\omega)$$

or

$$\omega = d\left[(\Psi^*)^{-1} S\Psi^*\omega\right].$$

Thus, for a form $\omega$ defined on a region $D$, we have a whole family of antiderivative
operators, $(\Psi^*)^{-1} S\Psi^*$, corresponding to various ways of expressing $D$ as the
image of a star-shaped region: $D = \Psi(Q)$. Another way of looking at this state of
affairs is to note that we may introduce various coordinate systems on $D$ in such
a way that the region in coordinate space which gets mapped into $D$ is star-shaped.
The operator $(\Psi^*)^{-1} S\Psi^*$ then corresponds to integrating along straight lines in
coordinate space. If $D$ is the surface of the Gulf of Mexico, for example, then we
form $S$ by integrating along straight lines joining points of $D$ to the center of the
Earth (or other origin). By introducing spherical coordinates, we construct a
different antiderivative $(\Psi^*)^{-1} S\Psi^*$ which corresponds to joining each point to the
origin by a path which appears straight when drawn on a Mercator projection map.
Here is a summary of the procedure for forming an antiderivative of a differential
form

$$\omega = \sum_{i_1 < \cdots < i_k} a_{i_1 \ldots i_k}(x^1, \ldots, x^n)\,dx^{i_1}\wedge\cdots\wedge dx^{i_k}.$$

Step 1. Form the pullback $\beta^*\omega$ by making the replacement $x^i \to tx^i$ in the arguments
of all the coefficient functions, so that

$$a(x^1, \ldots, x^n) \to a(tx^1, \ldots, tx^n)$$

and make the replacement

$$dx^i \to x^i\,dt + t\,dx^i.$$

Step 2. Throw out all terms which do not involve $dt$. Move $dt$ to the left in all other
terms, keeping track of signs carefully.

Step 3. Treat the $dt$ as in an ordinary integral, and integrate over $t$ from 0 to 1. The
following examples show the procedure in action:



Example 1. $\omega = y^2\,dx + 2xy\,dy$ (a closed one-form).

Step 1. $\beta^*\omega = t^2y^2(x\,dt + t\,dx) + 2t^2xy(y\,dt + t\,dy)$.

Step 2. $\beta^*\omega = (t^2xy^2 + 2t^2xy^2)\,dt$ + other terms.

Step 3. $S\omega = \int_0^1 3t^2xy^2\,dt = xy^2$.

Check. $d(xy^2) = y^2\,dx + 2xy\,dy$.
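
For a one-form $\omega = a\,dx + b\,dy$ in the plane, Steps 1 to 3 reduce to the single formula $S\omega(x, y) = \int_0^1 [x\,a(tx, ty) + y\,b(tx, ty)]\,dt$, which is easy to evaluate numerically. The sketch below is illustrative only: the function name `S_one_form`, the use of Simpson's rule, and the sample point are assumptions, not something prescribed by the text.

```python
def S_one_form(a, b, x, y, steps=1000):
    """Value at (x, y) of S(omega) for the one-form omega = a dx + b dy.

    After the pullback x -> t x, y -> t y, the dt-part of omega is
    (x * a(tx, ty) + y * b(tx, ty)) dt; it is integrated over t in [0, 1]
    with Simpson's rule (steps must be even).
    """
    def integrand(t):
        return x * a(t * x, t * y) + y * b(t * x, t * y)

    h = 1.0 / steps
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0

# Coefficients of Example 1: omega = y^2 dx + 2 x y dy.
def a_example(x, y):
    return y * y

def b_example(x, y):
    return 2.0 * x * y

if __name__ == "__main__":
    x, y = 1.5, -2.0
    # The two printed numbers should agree: S(omega) and x * y^2.
    print(S_one_form(a_example, b_example, x, y), x * y * y)
```

The same routine applied to a one-form that is not closed still returns a function, but its differential will no longer reproduce $\omega$; the identity $dS\omega + S\,d\omega = \omega$ shows that the discrepancy is exactly $S\,d\omega$.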


Example 2. $\omega = \sin x\,dx\wedge dy$.

Step 1. $\beta^*\omega = \sin tx\,(x\,dt + t\,dx)\wedge(y\,dt + t\,dy)$.

Step 2. $\beta^*\omega = xt\sin tx\,dt\wedge dy - yt\sin tx\,dt\wedge dx$ + term without $dt$.

Step 3.

$$S\omega = (x\,dy - y\,dx)\int_0^1 t\sin xt\,dt = (x\,dy - y\,dx)\,\frac{\sin x - x\cos x}{x^2}$$

$$= x\left(\frac{\sin x - x\cos x}{x^2}\right)dy - y\left(\frac{\sin x - x\cos x}{x^2}\right)dx$$

or

$$S\omega = -\cos x\,dy + y\left(\frac{x\cos x - \sin x}{x^2}\right)dx + \frac{\sin x}{x}\,dy.$$

The answer differs from the ‘obvious’ antiderivative $-\cos x\,dy$ by the exact one-form

$$d\left(\frac{y}{x}\sin x\right) = \frac{x\cos x - \sin x}{x^2}\,y\,dx + \frac{\sin x}{x}\,dy.$$

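Example 3. $\omega = (y^2 - x^2)z\,dx\wedge dy + (x^2 - z^2)y\,dz\wedge dx + (z^2 - y^2)x\,dy\wedge dz$.

Step 1.

$$\beta^*\omega = t^3(y^2 - x^2)z\,(x\,dt + t\,dx)\wedge(y\,dt + t\,dy) + t^3(x^2 - z^2)y\,(z\,dt + t\,dz)\wedge(x\,dt + t\,dx) + t^3(z^2 - y^2)x\,(y\,dt + t\,dy)\wedge(z\,dt + t\,dz).$$

Step 2.

$$\beta^*\omega = t^4\big[\left(-(y^2 - x^2)zy + (x^2 - z^2)yz\right)dt\wedge dx + \left((y^2 - x^2)zx - (z^2 - y^2)xz\right)dt\wedge dy + \left(-(x^2 - z^2)yx + (z^2 - y^2)xy\right)dt\wedge dz\big] + \text{terms without } dt.$$

Step 3.

$$S\omega = \int_0^1 t^4\,dt\,\big[(2x^2yz - y^3z - yz^3)\,dx + (2y^2zx - x^3z - xz^3)\,dy + (2z^2xy - xy^3 - x^3y)\,dz\big]$$

or

$$S\omega = \tfrac{1}{5}\big[(2x^2yz - y^3z - yz^3)\,dx + (2y^2zx - x^3z - xz^3)\,dy + (2z^2xy - xy^3 - x^3y)\,dz\big].$$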




We have just shown that the kernel of $S: \Omega^k \to \Omega^{k-1}$ is a subspace of the image
of $S: \Omega^{k+1} \to \Omega^k$. Indeed, it is the entire image. Suppose, for example, that a $k$-form $\omega$
satisfies $S\omega = 0$. Then, since $\omega = S\,d\omega + dS\omega$ we know that $\omega = S\,d\omega$ so that $\omega$ is
in the image of $S$. To summarize: $S\omega = 0$ if and only if $\omega = S\phi$ for some $\phi$.


Summary 

A Exterior algebra and calculus 

You should be able to define the spaces $\Lambda^k(V^*)$ for an $n$-dimensional vector space
$V$, and to write down a basis for each of these spaces.

You should be able to state and apply the properties of the d operator for 
differential forms of arbitrary degree, including its relationship to pullback. 

B Integration of differential forms 

You should be able to evaluate the integral of a k-form over a k-cell which is 
expressed as the image of the unit k- cube. 

You should be able to state and apply Stokes’ theorem and outline a proof of it. 

C Differential forms and cohomology 

You should know how to construct cochains by integration of differential forms
and be able to identify forms that define basis elements of $H^k$ for 2-complexes or
3-complexes.

Given a differential $k$-form $\phi$ with $d\phi = 0$, defined on a star-shaped region, you
should be able to construct the $(k - 1)$-form $S\phi$ with the property $dS\phi = \phi$.


Exercises 


15.1(a) Let $\tau(v_1, v_2, v_3)$ be an alternating trilinear function (i.e., a three-form).
Without invoking properties of determinants, prove that, if $v_1$, $v_2$, and $v_3$
are linearly dependent, then $\tau(v_1, v_2, v_3) = 0$.

(b) Let $\omega^1, \omega^2, \omega^3$ be three elements of $V^*$. Without invoking properties of
determinants, prove that, if $\omega^1, \omega^2, \omega^3$ are linearly dependent, then
$\omega^1\wedge\omega^2\wedge\omega^3 = 0$.

15.2. In four-dimensional spacetime let $E$ be the two-form $E = dt\wedge(E_x\,dx +
E_y\,dy + E_z\,dz)$. Let $B = B_z\,dx\wedge dy - B_y\,dx\wedge dz + B_x\,dy\wedge dz$. Calculate
$E\wedge B$.

15.3. Suppose we define the determinant of a linear transformation $A: \mathbb{R}^n \to \mathbb{R}^n$
by

$$\operatorname{Det} A = dx^1\wedge dx^2\wedge\cdots\wedge dx^n[Ae_1, Ae_2, \ldots, Ae_n]$$

where $\{dx^1, dx^2, \ldots, dx^n\}$ are dual to $\{e_1, e_2, \ldots, e_n\}$.

(a) Prove that if $A$ is the identity matrix, $\operatorname{Det} A = 1$.

(b) Prove that if $A^*$ denotes the adjoint of $A$, then

$$\operatorname{Det} A = A^*dx^1\wedge A^*dx^2\wedge\cdots\wedge A^*dx^n(e_1, e_2, \ldots, e_n).$$


15.4. Evaluate the following determinants by using the results of Exercises 
15.3(c), (b): 



( 2 3^ 


7 2\ 


< - 1\ 


( 3\

(b) $$\operatorname{Det}\begin{pmatrix} a & -b & 0 & 0 \\ b & a & -b & 0 \\ 0 & b & a & -b \\ 0 & 0 & b & a \end{pmatrix}.$$



15.5. Let $\Lambda^3(V)$ denote the space of all alternating multilinear functions from
$V \times V \times V$ to $\mathbb{R}$. Suppose that $V$ is four-dimensional, and let $e^1, e^2, e^3, e^4$ be
a basis for the dual space $V^*$. Using the wedge notation, write down a basis
for $\Lambda^3(V)$ and write a formula for the action of one basis element on a
triplet of vectors $(v_1, v_2, v_3)$.

15.6. Let $\omega = f\,dx + dy$, $\tau = g\,dx + dz$ where $f$ and $g$ are differentiable functions
of $x$, $y$, and $z$. Calculate $d(\omega\wedge\tau)$, expressing your answer as a multiple of
$dx\wedge dy\wedge dz$.

15.7. Let $\beta$ be a differentiable mapping from an affine space $A$ (dimension $m$) to
$B$ (dimension $n$). Let $\tau(q; w)$ be a one-form on $B$, where $q$ is a point of $B$, $w$ a
vector at that point. Then $d\tau$ may be defined as

$d\tau(q; w_1, w_2)$ = best linear approximation to

$$\tau(q + w_1; w_2) - \tau(q; w_2) - \tau(q + w_2; w_1) + \tau(q; w_1).$$

The pullback of a one-form is defined by

$$(\beta^*\tau)(q; v) = \tau(\beta q; d\beta(v))$$

while for a two-form $\sigma$,

$$(\beta^*\sigma)(q; v_1, v_2) = \sigma(\beta q; d\beta(v_1), d\beta(v_2)).$$

(a) Prove directly from the above definitions (and the chain rule, if you
need it)

$$d\beta^*\tau = \beta^*(d\tau).$$

(b) Introduce affine coordinates $u^1, \ldots, u^m$ on $A$ and $x^1, \ldots, x^n$ on $B$. Then
$\beta$ may be specified by the differentiable pullback functions

$$\beta^*x^1 = F^1(u^1, \ldots, u^m),$$
$$\beta^*x^2 = F^2(u^1, \ldots, u^m),$$
$$\vdots$$
$$\beta^*x^n = F^n(u^1, \ldots, u^m),$$

while

$$\tau = \sum_{j=1}^n G^j(x^1, \ldots, x^n)\,dx^j$$

where the functions $G^j$ are differentiable. Calculate $d\beta^*\tau$ and $\beta^*(d\tau)$
explicitly and show that they are equal.

(c) Let $\tau$ and $\lambda$ both be one-forms. Show that if $\Omega = \tau\wedge\lambda$ then $\beta^*\Omega =
(\beta^*\tau)\wedge(\beta^*\lambda)$.

15.8. Spherical coordinates $r, \theta, \phi$ and Cartesian coordinates $x, y, z$ are related by

$$\alpha^*x = r\sin\theta\cos\phi,$$
$$\alpha^*y = r\sin\theta\sin\phi,$$
$$\alpha^*z = r\cos\theta,$$

where $\alpha$ is the mapping which carries the point $(r, \theta, \phi)$ in spherical
coordinate space into the point whose coordinates are $r$, $\theta$, and $\phi$.

Calculate the pullback of the following differential forms (i.e., express them
in spherical coordinates).

(a) $\alpha^*(x\,dy - y\,dx)$;

(b) $\alpha^*\left[\dfrac{x\,dx + y\,dy + z\,dz}{(x^2 + y^2 + z^2)}\right]$;

(c) $\alpha^*\left[\dfrac{x\,dy\wedge dz + y\,dz\wedge dx + z\,dx\wedge dy}{(x^2 + y^2 + z^2)^{3/2}}\right]$;

(d) $\alpha^*\,dx\wedge dy\wedge dz$.


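15.9(a) Let $f$ be a differentiable function on $\mathbb{R}^3$. Let $A$ be the two-form

$$A = \frac{\partial f}{\partial x}\,dy\wedge dz + \frac{\partial f}{\partial y}\,dz\wedge dx + \frac{\partial f}{\partial z}\,dx\wedge dy.$$

Show that $dA = \Delta f\,dx\wedge dy\wedge dz$, where $\Delta$ denotes the Laplacian
$\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$.

(b) Let $F_x, F_y, F_z$ be differentiable functions on $\mathbb{R}^3$ which satisfy

$$\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} = 0.$$

Define

$$B = (\ldots)\,dx + (\ldots)\,dy + (\ldots)\,dz.$$

Show that $dB = \Delta F_x\,dy\wedge dz + \Delta F_y\,dz\wedge dx + \Delta F_z\,dx\wedge dy$.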

15.10. A solid of revolution about the $z$-axis is bounded below by the disk
$x^2 + y^2 \le 1$, $z = 0$, on the side by the cylinder $x^2 + y^2 = 1$, $0 < z < 2$, and on
top by the paraboloid $z = 1 + x^2 + y^2$ described parametrically by

$$\beta^*x = r\cos\theta,$$
$$\beta^*y = r\sin\theta,$$
$$\beta^*z = 1 + r^2.$$

All these surfaces are oriented counterclockwise as viewed from outside,
as shown in Fig. 15.30.

(a) Evaluate the integral of the two-form $\tau = x\,dy\wedge dz - y\,dx\wedge dz$ over
the paraboloidal top surface.

(b) Evaluate the integral of $\tau$ over the cylindrical surface.

(c) Use Stokes’ theorem to express the volume of the solid in terms of the
integrals which you have just calculated.

15.11. Let $\Omega = x\,dy\wedge dz$.

(a) Evaluate $\int\Omega$ over the disk $x^2 + y^2 \le R^2$, $z = 0$.

(b) Consider the hemisphere $x^2 + y^2 + z^2 = R^2$, $z \ge 0$, which can be
parametrized by spherical coordinates as follows (see Fig. 15.31):

$$z = R\cos\theta,$$
$$x = R\sin\theta\cos\phi,$$
$$y = R\sin\theta\sin\phi.$$

Express $\Omega$ in terms of the constant $R$ and the coordinates $\theta$ and $\phi$, and
thereby evaluate $\int\Omega$ over the hemisphere described above.

(c) Use Stokes’ theorem to write down an integral of a three-form which
must equal the difference of the two surface integrals in (b) and (a).
State the geometrical significance of this integral, and thereby
evaluate it by inspection.



Figure 15.30 





Figure 15.31 


15.12(a) Consider a solid of uniform density $\rho$ which occupies a region $Q$ in $\mathbb{R}^3$.
Show that the integral of the two-form

$$\tau = \tfrac{1}{3}\rho\,(x^3\,dy\wedge dz - y^3\,dx\wedge dz)$$

over the boundary of $Q$ gives the moment of inertia $I_{zz}$ for the solid.

(b) Invent a two-form $\sigma$ whose integral over $\partial Q$ gives the product of inertia

(c) Use the result of part (a) to calculate the moment of inertia of a sphere of
radius $a$ about a diameter.

Hint: Use spherical coordinates.

15.13(a) Let $W$ be the region in $\mathbb{R}^3$ occupied by a solid body of uniform density $\rho$.
Show that the $z$-component of the center of mass of the body, $\bar z$, can be
determined as a quotient of two surface integrals evaluated over the
boundary of the body, $\partial W$, as follows:

$$\bar z = \frac{\int_{\partial W} r^2\,dx\wedge dy}{2\int_{\partial W} z\,dx\wedge dy}$$

where $r^2 = x^2 + y^2 + z^2$.

(b) Spherical coordinates are defined by the pullback $\alpha^*x = r\sin\theta\cos\phi$,
$\alpha^*y = r\sin\theta\sin\phi$, $\alpha^*z = r\cos\theta$. Calculate $\alpha^*(dx\wedge dy)$.

(c) Using the result of (b), evaluate

$$\int_{\partial W} r^2\,dx\wedge dy$$

for the case where $W$ is the interior of a hemisphere of radius $a$, with a
right-handed orientation.

15.14. Let $C$ be the half-cylinder bounded by the planes $z = 0$ (Bottom) and $z = 1$
(Top), the plane $x = 0$ (Flat) and the surface $x^2 + y^2 = a^2$ (Curve). $C$ has a
right-handed orientation, so that $e_x, e_y, e_z$, as shown in figure 15.32, are a
correctly ordered basis. Each face in the boundary of $C$ has been given an
ordered basis, as shown, which defines its orientation.

Figure 15.32

(a) Write down an expression, with appropriate signs, for $\partial C$.

(b) Let $\tau$ be the two-form $\tau = x^2\,dy\wedge dz - 2xz\,dx\wedge dy$. Calculate the
pullback of $\tau$ under the mapping defined by $\beta^*x = a\cos u$,
$\beta^*y = a\sin u$, $\beta^*z = v$, and thereby calculate the integral of $\tau$ over the
face Curve.

(c) Using the mapping defined by $\alpha^*x = r\cos\theta$, $\alpha^*y = r\sin\theta$, $\alpha^*z = 1$,
evaluate the integral of $\tau$ over the face Top.

(d) What is the integral of $\tau$ over the other two faces? Explain.

(e) Calculate $d\tau$. Using Stokes’ theorem, explain how this result provides
a check on your answers to parts (b), (c) and (d).

15.15. Consider the two-complex shown in figure 15.33.

(a) Let $W$ be a one-cochain. Express the two-cochain $dW$ in terms of $W$;
i.e., express $\int_{S_1} dW$ and $\int_{S_2} dW$ in terms of $W_\alpha, \ldots$

(b) Consider the one-form $\omega = y\,dx + x\,dy = d(xy)$. Calculate the
one-cochain $W$ determined by $\omega$ by integrating $\omega$ over the branches. Verify
that $dW = 0$.

(c) Do the same for the one-form $\omega = (x\,dy - y\,dx)/r^2$. Show that it
determines a cochain $W$ which is a cocycle but not a coboundary.
Would $W$ still be a cocycle if $S_3$, the interior of the circle bounded
by $\alpha$ and $\varepsilon$, were included in the complex?

15.16. The unit cube shown in figure 15.34 determines a three-complex with six
faces:

Each face is given a counterclockwise orientation as seen from the outside,
so that the interior of the cube, $R$, has as its boundary $\partial R = S_1 + S_2 +
S_3 + S_4 + S_5 + S_6$.

(a) Let $T$ be any two-cochain. Find an expression for $dT$.

(b) ... Construct the corresponding two-cochain $T$ by integrating $\tau$ over each face.
Evaluate $d\tau$ and construct the corresponding three-cochain: check ...

(c) Let $\omega$ be the one-form $\omega = x^2y\,dz$. Compute $d\omega$. Evaluate the
one-cochain $W$ corresponding to $\omega$ on branches $\alpha, \beta, \gamma, \delta, \varepsilon, \eta$. Evaluate the
two-cochain $T$ corresponding to $d\omega$ on faces $S_2$ and $S_3$. Check that
$T$, evaluated on $S_2 + S_3$, equals $W$ evaluated on the boundary of
$S_2 + S_3$.

15.17(a) Let $\tau$ be a two-form on $\mathbb{R}^3$: $\tau = a\,dx\wedge dy + b\,dx\wedge dz + c\,dy\wedge dz$, where
$a, b, c$ are differentiable functions of $x, y, z$. Let $\phi$ be a mapping from $\mathbb{R}^2$
(coordinates $u, v$) to $\mathbb{R}^3$ (coordinates $x, y, z$) defined by

$$\phi^*x = F(u, v), \quad \phi^*y = G(u, v), \quad \phi^*z = H(u, v).$$

Verify by explicit computation that $\phi^*(d\tau) = d(\phi^*\tau)$. You will of course
have to use the chain rule.

(b) Let $v_1, v_2$ be vectors in $\mathbb{R}^2$. Then the pullback $\phi^*\tau$ can be defined by

$$\phi^*\tau(v_1, v_2) = \tau[d\phi(v_1), d\phi(v_2)]$$

(remember, $d\phi$ is represented at each point by a $3 \times 2$ Jacobian matrix).
Use this definition and the chain rule to prove that $\phi^*d\tau = d(\phi^*\tau)$.

15.18. Using the $S$-operator described in section 15.4:

(a) Find a function $f(x, y, z)$ such that

$$df = ye^{yz}\,dx + (x + xyz)e^{yz}\,dy + xy^2e^{yz}\,dz.$$

(b) Find a one-form $\omega$ such that

$$d\omega = x\,dy\wedge dz + y\,dx\wedge dz.$$

(c) Find a two-form $\tau$ such that

$$d\tau = xy^2z^3\,dx\wedge dy\wedge dz.$$

15.19. Let $B = z^2\,dx\wedge dy + yz\,dx\wedge dz - xz\,dy\wedge dz$.

(a) Let $W$ be any region in $\mathbb{R}^3$. Show that $\int_{\partial W} B = 0$.

(b) Find a one-form $A$ such that $dA = B$.

15.20. Consider the two-complex shown in figure 15.35, with four nodes, five
branches, and one two-cell $S$, which is shaded in the diagram. Let

$$\omega = d\theta = \frac{x\,dy - y\,dx}{x^2 + y^2}.$$

Determine the one-cochain $W$ which corresponds to $\omega$. (Calculate
$W_\alpha, W_\beta, W_\gamma, W_\delta, W_\varepsilon$.) Show that $W$ is a cocycle but not a coboundary.

15.21. Let $\mathbb{R}^{1,3}$ denote four-dimensional spacetime, with affine coordinates
ordered $t, x, y, z$.

(a) Write down a basis for $\Lambda^2(\mathbb{R}^{1,3})$.

(b) Consider the two-form

If the exterior derivative of this two-form is zero, what relation among the partial derivatives of $A$, $B$, and $C$
must hold?

Chapter 16 is devoted to electrostatics. We suggest that the
dielectric properties of the vacuum give the continuous
analog of the capacitance of a network, and that these
dielectric properties are what determine Euclidean geometry
in three-dimensional space. The basic facts of potential theory
are presented.


16.1. From the discrete to the continuous


We let $A^0 = \Omega^0(\mathbb{R}^3)$ denote the space of smooth functions on $\mathbb{R}^3$ (thought of as forms
of ‘degree zero’); we let $A^1 = \Omega^1(\mathbb{R}^3)$ denote the space of smooth linear differential
forms, $A^2$ denote the space of smooth forms of degree two and $A^3$ the space of smooth
forms of degree three. We have been thinking of $A^i$ as incipient cochains of degree
$i$; that is, if we are given a complex in $\mathbb{R}^3$, each form of degree $i$ defines a cochain of
degree $i$ by a process of integration. We thus have been thinking of $i$-forms as rules
which assign numbers to chains. We can turn this picture around. If we fix a definite
chain $c$ of degree $i$, then we can think of $c$ as defining a linear function on the space $A^i$:
to each form $\omega$ of degree $i$ we assign the number $\int_c \omega$. If $\omega_1$ and $\omega_2$ are two forms of
degree $i$ and if $a$ and $b$ are real numbers, then it follows from the properties of
integration that

$$\int_c (a\omega_1 + b\omega_2) = a\int_c \omega_1 + b\int_c \omega_2;$$


in other words, the rule which assigns to each $\omega$ the number $\int_c \omega$ is a linear function.
The simplest illustration of this is when we take the chain $c$ to be a zero-chain, say
the zero-chain which consists of the single point $P \in \mathbb{R}^3$ with the orientation +. In this
case, the integration reduces simply to evaluation: the chain $c$ gives rise to the linear
function on $A^0$ which assigns to each $f \in A^0$ the value $f(P)$. If we think of $c$ as a unit
charge placed at the point $P$ and $f$ as a potential, then $f(P)$ is the energy
corresponding to the charge distribution which places a unit charge at $P$ when the
potential is given by $f$. This reversed viewpoint, where we consider $P$ as a linear
function of $f$, has the following advantage. Suppose that we want to replace the
discrete charge distribution concentrated at $P$ by a ‘smeared-out’ or ‘continuous’
charge distribution, say with density $\rho$. Here $\rho$ is assumed to be a smooth function of
compact support. (Compact support means that $\rho$ vanishes outside some bounded
set.) We can do so as follows. Consider the three-form (of compact support)
$\rho\,dx\wedge dy\wedge dz$. For each $f \in A^0$ we can multiply $f$ with $\rho\,dx\wedge dy\wedge dz$ to obtain the
three-form $f\rho\,dx\wedge dy\wedge dz$ which is a three-form of compact support on $\mathbb{R}^3$. Having
fixed the orientation on $\mathbb{R}^3$, we can integrate this three-form to obtain a number.
Thus the form $\rho\,dx\wedge dy\wedge dz$ defines a linear function on $A^0$, the linear function
which assigns to each $f \in A^0$ the number $\int_{\mathbb{R}^3} f\rho\,dx\wedge dy\wedge dz$. (Notice that if the total
charge is one and $\rho$ is concentrated near
the point $P$ - in other words, if we let the continuous charge distribution approach
the point charge concentrated at $P$ - then the value of this integral will approach the
number $f(P)$.) Let $A_3$ denote the space of three-forms of compact support on $\mathbb{R}^3$.
Then we have shown that every $\rho\,dx\wedge dy\wedge dz \in A_3$ defines a linear function on $A^0$.
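
The limiting statement in the parenthetical remark can be illustrated numerically. What follows is a sketch under assumptions: the bump profile, the test function `f`, the point `P`, and the names `bump` and `smeared_value` are all hypothetical choices made for illustration. The density is a smooth bump of radius `eps` about $P$, normalized on the quadrature grid so that its total charge is one.

```python
import math

def bump(s):
    """Smooth radial profile: positive for 0 <= s < 1 and zero for s >= 1."""
    u = 1.0 - s * s
    return math.exp(-1.0 / u) if u > 0.0 else 0.0

def smeared_value(f, P, eps, n=24):
    """Approximate the integral of f * rho over R^3.

    rho is a bump of radius eps centred at P, normalized to total charge one
    (the normalization uses the same midpoint grid as the integral itself).
    """
    px, py, pz = P
    h = 2.0 * eps / n
    num = 0.0
    den = 0.0
    for i in range(n):
        x = px - eps + (i + 0.5) * h
        for j in range(n):
            y = py - eps + (j + 0.5) * h
            for k in range(n):
                z = pz - eps + (k + 0.5) * h
                s = math.sqrt((x - px) ** 2 + (y - py) ** 2 + (z - pz) ** 2) / eps
                w = bump(s)
                num += w * f(x, y, z)
                den += w
    return num / den

def f(x, y, z):
    """A smooth test potential (hypothetical choice)."""
    return math.sin(x) + y * z + math.exp(z)

P = (0.5, -0.2, 0.3)

if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.01):
        # The difference from f(P) should shrink as the charge is concentrated.
        print(eps, smeared_value(f, P, eps) - f(*P))
```

The linear function defined by the smeared-out charge is thus close to evaluation at $P$ once the support of $\rho$ is small compared with the scale on which $f$ varies.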


In reality, as we know, the electric charges are discrete. But we can now get a grip
on the notion of approximating the discrete by the continuous. Suppose we had some
densely packed distribution of small charges, $c_i$ at the points $P_i$. Then we would get a linear
function

$$f \mapsto \sum_i c_i f(P_i)$$

Figure 16.1

on functions. This linear function might be well-approximated (for a broad class of
functions $f$) by the smeared-out charge distribution $\rho\,dx\wedge dy\wedge dz$ in the sense that

$$\int f\rho\,dx\wedge dy\wedge dz$$

Figure 16.2

is close to

$$\sum_i c_i f(P_i).$$

We shall now introduce continuous approximations for one-chains, two-chains, etc.
Indeed, let $A_2$ denote the space of smooth two-forms of compact support. We claim
that we can regard an element $\Omega$ of $A_2$ as a kind of smeared-out one-chain, i.e. that $\Omega$
defines a linear function on $A^1$. To see this let $\omega$ be any element of $A^1$ and form the
product $\omega\wedge\Omega$. This is a three-form of compact support which we can again
integrate over $\mathbb{R}^3$ to obtain a number. In other words, $\Omega \in A_2$ defines a linear
function on $A^1$ which assigns to each $\omega$ the number $\int_{\mathbb{R}^3}\omega\wedge\Omega$. Similarly, if we let $A_1$
denote the space of smooth one-forms of compact support, then each element of $A_1$
defines a linear function on $A^2$, so can be thought of as a kind of smeared-out two-chain,
and, if $A_0$ denotes the space of zero-forms (i.e., smooth functions) of compact
support, then each element of $A_0$ defines a linear function on $A^3$ and so an element of
$A_0$ can be thought of as a kind of smeared-out three-chain. To summarize:

$A^0$ is paired with $A_3$,
$A^1$ is paired with $A_2$,
$A^2$ is paired with $A_1$,
$A^3$ is paired with $A_0$,

in the sense that, if $\omega \in A^i$ and $\Omega \in A_{3-i}$, we get the number $\int_{\mathbb{R}^3}\omega\wedge\Omega$. For fixed $\Omega$ this
is a linear function of $\omega$; also, for fixed $\omega$ it is a linear function of $\Omega$. Also this pairing is
invariant under any orientation-preserving one-to-one map $\phi$ of $\mathbb{R}^3$ onto itself with
differentiable inverse. If $\phi$ is such a map, then, for any three-form, $\tau$, with compact
support, the change of variables formula for integration says that $\int_{\mathbb{R}^3}\phi^*\tau = \int_{\mathbb{R}^3}\tau$. On
the other hand, $\phi^*(\omega\wedge\Omega) = \phi^*\omega\wedge\phi^*\Omega$. Therefore

$$\int_{\mathbb{R}^3}\phi^*\omega\wedge\phi^*\Omega = \int_{\mathbb{R}^3}\omega\wedge\Omega.$$

In other words, the pairing is invariant under orientation-preserving changes of
coordinates.


16.2. The boundary operator 

Since we have the map $d: A^i \to A^{i+1}$ we have the adjoint map, $\partial$, which assigns,
to each linear function, $c$, on $A^{i+1}$ the linear function $\partial c$ on $A^i$ defined by $(\partial c)(\omega) =
c(d\omega)$. This suggests that we should be able to find a map $\partial$ from $A_{3-(i+1)}$ to $A_{3-i}$ such that
for any $\sigma \in A_{3-(i+1)}$ and any $\omega \in A^i$ we have

$$\int_{\mathbb{R}^3}\omega\wedge\partial\sigma = \int_{\mathbb{R}^3} d\omega\wedge\sigma.$$

To find out what the operator $\partial$ actually is we observe that

$$d(\omega\wedge\sigma) = d\omega\wedge\sigma + (-1)^i\,\omega\wedge d\sigma,$$

and, by Stokes’ theorem (and the fact that $\omega\wedge\sigma$ has compact support, so that we can
replace integration over $\mathbb{R}^3$ by integration over some finite region on whose
boundary the form $\omega\wedge\sigma$ vanishes), the integral $\int_{\mathbb{R}^3} d(\omega\wedge\sigma)$ vanishes. Thus

$$\int_{\mathbb{R}^3} d\omega\wedge\sigma = (-1)^{i+1}\int_{\mathbb{R}^3}\omega\wedge d\sigma.$$

In other words,

$$\partial = -d \text{ on } A_0, \qquad \partial = d \text{ on } A_1, \qquad \partial = -d \text{ on } A_2.$$

We thus have the series of maps

$$A^0 \xrightarrow{\;d\;} A^1 \xrightarrow{\;d\;} A^2 \xrightarrow{\;d\;} A^3,$$

$$A_3 \xleftarrow{\;\partial\;} A_2 \xleftarrow{\;\partial\;} A_1 \xleftarrow{\;\partial\;} A_0.$$

The operator $\partial$ is sometimes called the formal transpose of $d$. The reason for the
word formal is the fact that we can express $\partial$ purely in terms of $d$ and that the formula

$$\int\omega\wedge\partial\sigma = \int d\omega\wedge\sigma$$

is only valid when the integration is over a region for which there are no boundary
contributions. We have arranged this by having the integration carried out over
all space and by insisting that $\sigma$ have compact support. We could equally well
arrange that the same formula hold by insisting that $\omega$ have compact support.
That is, we could think of $d$ as a map from $A_i$ to $A_{i+1}$ and $\partial$ as a map from $A^{3-(i+1)}$
to $A^{3-i}$. That is, we might want to consider the forms of compact support as the
fundamental objects, and consider that the map $\partial$ is defined on the space of linear
functions on $A_i$.


16.3. Solid angle 

As an illustration of this reversed point of view, we shall do a basic computation.
Let $P$ be some point in $\mathbb{R}^3$ with coordinates $x_P, y_P, z_P$. Let $\tau_P$ be the solid angle
form subtended from $P$ which is defined to be the two-form

$$\tau_P = \frac{(x - x_P)\,dy\wedge dz - (y - y_P)\,dx\wedge dz + (z - z_P)\,dx\wedge dy}{r_P^3}, \qquad r_P^2 = (x - x_P)^2 + (y - y_P)^2 + (z - z_P)^2.$$

The form $\tau_P$ is not defined on all of $\mathbb{R}^3$; it is only defined on the space $\mathbb{R}^3 - \{P\}$, i.e., on three-space with the point $P$
removed. Nevertheless, the form $\tau_P$ does define a linear function on the space $A_1$.
Let $\omega$ be a one-form with smooth coefficients and compact support. Then $\omega\wedge\tau_P$ is a
three-form on $\mathbb{R}^3 - \{P\}$. The coefficient of $dx\wedge dy\wedge dz$ is not defined at $P$ but the singularity at $P$ is only
of order $r_P^{-2}$, which is integrable in three dimensions.
Hence, the point $P$ represents no problem in the computation of the integral. Since
$\omega$ is of compact support, the integral over all space is well defined. In order to
evaluate the integral, it is convenient to pass to spherical coordinates centered at
$P$. If $r_P$, $\theta$, and $\phi$ are such coordinates, then

$$\tau_P = \sin\theta\,d\theta\wedge d\phi.$$

(Thus $\tau_P$ gives the solid angle subtended by a surface over which it is integrated.)
Suppose that $\omega = F\,dr_P + G\,d\theta + H\,d\phi$ in terms of these spherical coordinates. Then

$$\omega\wedge\tau_P = F\sin\theta\,dr_P\wedge d\theta\wedge d\phi.$$

Thus

$$\int_{\mathbb{R}^3}\omega\wedge\tau_P = \int_0^{2\pi}\int_0^{\pi}\sin\theta\left\{\int_0^\infty F(r_P, \theta, \phi)\,dr_P\right\}d\theta\,d\phi.$$

Let us now consider the case where $\omega = du$ for some smooth function $u$ of compact
support. Thus in the preceding formula $F = \partial u/\partial r_P$ in terms of the spherical
coordinates and the innermost integral on the right reduces to the constant value
$-u(P)$ so that integration with respect to $\theta$ and $\phi$ just multiplies by $4\pi$, the area
of the unit sphere, and we have proved the basic formula

$$\int du\wedge\tau_P = -4\pi u(P). \qquad (16.1)$$

Let $\delta_P$ denote the zero-chain which assigns to each function $u$ the value $u(P)$, i.e.,
$\delta_P$ represents a unit charge concentrated at $P$. Then the right-hand side of equation
(16.1) is the value of the zero-chain $-4\pi\delta_P$ when evaluated on $u$ while the left-hand
side is the value of $\tau_P$ when evaluated on $du$. We can thus write (16.1) as

$$\partial\tau_P = -4\pi\delta_P. \qquad (16.1a)$$

(Notice that in the preceding equation we may not equate $\partial$ with $d$. The form $\tau_P$
is not defined at $P$ and hence $d\tau_P$ is not defined there. At all points where $\tau_P$ is
defined, we have $d\tau_P = 0$. Nevertheless (16.1a) is fine as an equality of zero-chains -
that is, as linear functions on functions.)
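
Formula (16.1) can be checked numerically by carrying out the iterated integral in spherical coordinates centered at $P$. The sketch below makes several assumptions that are not in the text: the test function `u` is a Gaussian, which does not have compact support but decays so quickly that cutting the radial integral off at `r_max` plays the same role; the point `P`, the centre `C`, and the function names are hypothetical; and a plain midpoint rule is used throughout.

```python
import math

C = (1.0, 0.0, -0.5)   # centre of the test function u (hypothetical choice)
P = (0.2, -0.3, 0.1)   # the point from which the solid angle is subtended

def u(x, y, z):
    """Rapidly decaying stand-in for a smooth function of compact support."""
    return math.exp(-((x - C[0]) ** 2 + (y - C[1]) ** 2 + (z - C[2]) ** 2))

def grad_u(x, y, z):
    """Gradient of u, computed analytically."""
    val = u(x, y, z)
    return (-2.0 * (x - C[0]) * val,
            -2.0 * (y - C[1]) * val,
            -2.0 * (z - C[2]) * val)

def du_wedge_tau(p, r_max=8.0, n_r=200, n_theta=40, n_phi=40):
    """Iterated integral of sin(theta) * (du/dr) over r, theta, phi about p."""
    px, py, pz = p
    dr = r_max / n_r
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        st, ct = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            nx, ny, nz = st * math.cos(phi), st * math.sin(phi), ct
            inner = 0.0
            for k in range(n_r):
                r = (k + 0.5) * dr
                gx, gy, gz = grad_u(px + r * nx, py + r * ny, pz + r * nz)
                inner += gx * nx + gy * ny + gz * nz   # radial derivative F
            total += st * inner * dr
    return total * d_theta * d_phi

if __name__ == "__main__":
    # The two printed numbers should agree to within the quadrature error.
    print(du_wedge_tau(P), -4.0 * math.pi * u(*P))
```

Along each ray the inner sum approximates $u(\infty) - u(P) = -u(P)$, independently of the direction, which is the content of the remark that the angular integration merely multiplies by $4\pi$.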

Suppose we have a finite number of points, $P_1, P_2, P_3$, etc., and we place the
charges $c_1$ at $P_1$, $c_2$ at $P_2$, etc. That is, we consider the zero-chain

$$Q = c_1\delta_{P_1} + c_2\delta_{P_2} + \cdots.$$

Let us set

$$D = c_1\tau_{P_1} + c_2\tau_{P_2} + \cdots.$$

Then $D$ gives a well-defined linear function on $A_1$ and we have the formula

$$\partial D = -4\pi Q.$$

If, instead of a finite discrete charge distribution, we are given a charge distribution
with smooth charge density $\rho$ of compact support, then we can replace the preceding
definition of $D$, which involves a sum, by a corresponding integral. That is, we
can write $D = \int\rho(P)\tau_P\,dP$ and $Q = \rho\,dx\wedge dy\wedge dz$ and again the preceding equation
holds. In this case of continuous superposition, it is not hard to show, and we
shall do so shortly, that $D$ is a smooth two-form that is defined throughout all
space.


16.4. Electric field strength and dielectric displacement

We are now in a position to define the fundamental objects in the theory of
electrostatics. The electric field strength $E$ is a linear differential form, which, when
integrated along any path, gives the voltage drop across the path. Thus the units
of $E$ will be voltage/length. (Since voltage has units energy/charge and force has
units energy/length we can also write the units of $E$ as force/charge.) The basic
equation satisfied by $E$ is

$$dE = 0.$$

Locally, this is equivalent to the existence of a potential $u$, i.e., to the equation

$$E = -du. \qquad (16.2)$$

For most of the regions that we shall consider, this is the form of the equation
that we shall use.

We want to think of $E$ as an incipient one-cochain and of $u$ as an incipient
zero-cochain. We now also need an object which should give a smeared-out one-chain.
This will be a two-form, $D$. It is called the dielectric displacement. It is to
represent a smeared-out version of the one-chain giving the branch charges. We
also want a smeared-out version of the zero-chain representing the node charges.
It will be some three-form $\rho\,dx\wedge dy\wedge dz$. In our network model we had

$$\partial(\text{branch charges}) = -\text{node charges}.$$

So we expect that

$$\partial D = -\rho\,dx\wedge dy\wedge dz.$$

In fact, the standard definition of the units of $D$ and $\rho$ (in the cgs system) are
such that this is true up to a factor of $4\pi$:

$$\partial D = -4\pi\rho\,dx\wedge dy\wedge dz. \qquad (16.3)$$

This is known as Gauss’s Law. By Stokes’ theorem, and the fact that $\partial = -d$ on
two-forms, we can rewrite this as

$$\int_{\partial W} D = 4\pi\int_W \rho\,dx\wedge dy\wedge dz,$$

relating the surface integral of $D$ over the boundary of a region $W$ to the total
charge inside.

Of course, D is not determined by the equation $dD = 4\pi\rho\,dx\wedge dy\wedge dz$. Adding
any closed form to D will not change this equation. The two-form D is part of
the data of an electrostatic system. In principle, it can be measured by the following
procedure. Write

$$D = D_x\,dy\wedge dz + D_y\,dz\wedge dx + D_z\,dx\wedge dy.$$

To measure $D_z$, insert small parallel metal plates, lying in the xy-plane, into a
cavity in the surrounding medium. Touch the plates together, then separate them.
They acquire charges $\pm Q$. Then

$$D_z = 4\pi\lim_{\text{area}\to 0}\frac{\text{charge on top plate (toward } +z)}{\text{area of plates}}.$$

Figure 16.4

(The $4\pi$ is the fault of the cgs units.) This definition works in any dielectric, even
if we do not know $\varepsilon$, and it makes no mention of an electric field.

In fact, the preceding definition extends to a coordinate-free definition of D: 
given any pair of vectors, $v_1$ and $v_2$, form a parallel-plate capacitor whose plates

d rP An-romo Kir 2iir n ti/4 liir /4ofrna 
Since charge is additive, D is bilinear in $v_1$ and $v_2$, so by its definition it is a two-form.

We have now generalized the topological equations for capacitive networks to 
electrostatics in general. 



$V = -d\Phi$ becomes $E = -du$,

$\partial Q = -\rho$ becomes $\partial D = -4\pi\rho\,dx\wedge dy\wedge dz$.

If D is a smooth two-form, $dD = +4\pi\rho\,dx\wedge dy\wedge dz$.


We should recall that (having picked an orientation - say, the standard one) we
can regard E as a linear function of D that assigns to D the number

$$\int_{\mathbb{R}^3} E\wedge D.$$

In this formula, we have, so far, regarded either D or E as having compact support.
In fact, this integral may be defined, for a particular pair E and D, so long as the
product $E\wedge D$ vanishes sufficiently rapidly at $\infty$. For example, if the product goes
to zero as $r^{-4}$ or faster, the integral will converge. We shall need to keep this
degree of flexibility in mind.

We now have the smeared-out versions of the topological part of our theory of 
capacitive networks. We still need a version of the matrix C giving the relationship 
Q = CV. So we want a map 

$$D = C(E)$$

sending electric fields into dielectrics. We will study this map C in some detail in 
the next section. Here we will draw some consequences from the following two 
assumptions about C. 

Recall that in network theory the map C was diagonal in terms of the branch
voltages. In particular, if a voltage across a particular branch vanishes, then the
corresponding branch charge vanishes. As a very mild analogue to this condition,
we will assume that

(a) C is local in the sense that, if E vanishes identically in the interior of some
region, D = C(E) also vanishes identically there.

In analogy to the network case, we will assume that 

(b) The map C is symmetric. That is, if $E_1$ and $E_2$ are two electric fields, then

$$\int_{\mathbb{R}^3} E_1\wedge C(E_2) = \int_{\mathbb{R}^3} E_2\wedge C(E_1).$$

In the next section we shall describe the map C as it actually occurs in nature.
We shall see that there is an intimate connection between the map C for vacuum 
and Euclidean geometry. In all media, conditions (a) and (b) hold. For now, we 
shall derive some consequences of (a) and (b). 

Conducting material, by definition, offers no resistance to the passage of electric 
charge. In other words, in a conductor the form E, which measures the work 
done in moving an electric charge, must vanish. Thus, using (a), we see that

In the interior of any conductor, the forms E and D must both vanish.
Hence the function u must be constant on each connected conducting body.

As $D = 0$ in the interior of a conductor, $\rho = 0$ there as well. So there can be no
charges (in electrostatics) in the interior of a conductor. Or, in the words of Faraday,

All the charges of a conductor lie on its surface.

The density of this surface charge is, of course, given by D. The total charge, $\rho_i$,
on a particular conductor, is given by integrating D over its surface.

Let us suppose that we have introduced charges only on the conductors. So we
are assuming that $dD = 0$ outside the conductors. Then for two such potentials, $u$
and $\hat u$, with corresponding $D$ and $\hat D$, we have

$$d(u\hat D) = du\wedge\hat D + u\,d\hat D = -E\wedge\hat D$$

over the exterior of the conductors, as $d\hat D = 0$ there. Thus, by Stokes' theorem

$$(E,\hat E) \overset{\text{def}}{=} \int_{\substack{\text{region exterior}\\ \text{to all conductors}}} E\wedge\hat D = \int_{\substack{\text{all conductor}\\ \text{surfaces}}} u\hat D.$$

On the ith conductor surface, u takes on a constant value - say $\Phi_i$. We can thus
pull u out of the surface integral and so

$$(E,\hat E) = \sum_{\text{conductors}} \Phi_i\hat\rho_i.$$

This gives Green's reciprocity theorem as used in section 8 of Chapter 13.

A third property of the map C is that the scalar product ( , ) is positive-definite.
Thus if we take $\hat E = E$ in the preceding equation, we get

$$(E,E) = \sum_{\text{conductors}} \Phi_i\rho_i.$$

If all the conductor charges are zero, the right-hand side is zero. By the
positive-definiteness property, this would imply that E = 0 and hence that all the $\Phi$s are
equal. So if we consider one conductor (say an imaginary conductor at infinity)
as grounded, the charges $\rho$ uniquely determine the potentials $\Phi$. Similarly, the
potentials uniquely determine the charges. We would then have a map from the
space of all conductor potentials $\Phi$ to all conductor charges $\rho$ and we would be
able to use the results of Chapter 13.


All of this is under the assumption that, given an assignment, $\Phi$, of potentials
to the various conductors, we can find the (unique) corresponding E and D. At
this point the mathematician and the physicist part company. From the point of
view of mathematics, an interesting and non-trivial problem has been posed: the
Dirichlet problem. It is a question of an existence theorem in the theory of partial
differential equations. It occupied the efforts of many gifted mathematicians in the
last third of the nineteenth century. It was finally resolved positively, by several












potentials on all conductors. Conductor number 8 will be at some constant
potential, $\Phi_8$, as a result of these charges. A priori, we may not know what the
induced charges and potentials are on the interior conductors. But suppose we
consider a new function $u' = u$ outside and $u' = \Phi_8$ inside the cavity of conductor
number 8. This is a solution to our problem with $E = D = 0$ inside the cavity and
zero charges on all interior conductors. By uniqueness this is our original solution -
as no charges have been introduced to any interior conductor; similarly, if we
adjusted the potentials of any of the exterior conductors. In short (up to a meaningless,
overall constant in u), the interior cavity of a conductor is electrically screened




construction of electrostatic measuring instruments where we do not want any 






open - or replace a portion of the conductor by a wire mesh which is almost as 




the principle of electrical screening is a helpful computational device when used in 
conjunction with the following principle. 

Replacement of an equipotential by a conducting sheet. Suppose that we have a
solution $E = -du$, $D = CE$ of some system of charges or potentials. Let S be some
surface on which u is a constant, $\Phi$. (S is called an equipotential surface.) Now
suppose we replace S by a thin conducting sheet inserted at potential $\Phi$. The
nature of the map C is such that it is essentially unaffected by the insertion of
such a conductor. Thus the insertion of the conductor has practically no effect.
In the interior of the thin conducting sheet, E and D have become zero, and charge 
has accumulated along its two surfaces, but elsewhere everything is as before the 
sheet was inserted. 




of a charge Q placed at the origin must be of the form

$$u = Qf(r).$$

(In the next section, we shall see that, if the system is invariant under all Euclidean
motions, we must have

$$f(r) = \frac{c}{r}$$

for some constant c.) Now insert a thin spherical conducting sheet of radius a.
This does not affect the fields either inside or outside the sphere, but D vanishes
in the interior of the sheet. If we draw a spherical surface inside the sheet, the
total charge enclosed must be zero. Thus a total charge — Q has accumulated over 


the inside surface of the sphere and a total charge $+Q$ over the outside surface. Now
discharge the interior (by conducting a wire between the charge at the origin and






the field for the exterior of a charged spherical conductor is given by

$$u = \begin{cases} Qf(r), & r \ge a,\\ Qf(a), & r \le a,\end{cases}$$

where $f(r)$ denotes the potential of unit charge at the origin. This method allows
us to compute the field and the capacitance of a spherical capacitor whose plates
are a pair of concentric spherical conductors:



Again start with a charge Q at the origin, with the conductors absent. Insert the
spherical conductors. Charges accumulate along the inner and outer surfaces.
Discharge the interior surface of the inner conductor by connecting a wire to the
origin and discharge the outer surface of the outer conductor by grounding it to
infinity. The fields inside the inner sphere and outside the outer sphere vanish.
The function u between the two spheres remains the same as before with a charge
$Q$ on the outer surface of the inner sphere and $-Q$ on the inner surface of the
outer sphere.



Figure 16.8 







Here is another illustration. Suppose we place a charge $Q$ at one point of the
x-axis and $-Q$ at a second point of the x-axis. Then the potential at all points
will be given by

$$Q\bigl(f(r_-) - f(r_+)\bigr)$$

where $r_+$ and $r_-$ denote the distances to these two points. Under the assumption
of Euclidean invariance, $f(r) = c/r$. For this situation, the equipotential surfaces
look like figure 16.8.

After we insert a conducting sheet at one of these equipotentials, we can abolish 
the field on one side of the surface by discharging the charge — Q. We then get 
the field of a point charge Q and one of these surfaces. For example, if we take
the surface to be the plane x = 0, at zero potential, we obtain the solution for the
problem of a point charge in the presence of a grounded plane conductor. A charge
$-Q$ is induced on the inner surface. The field on the other side of the conductor
vanishes, while the field on the side of the charge is as if a charge $-Q$ were placed
on the other side of the conductor. This is an example of the method of images.

Figure 16.10












Here is one final example that we shall use later. Consider a sphere of radius R.
Place a charge $Q$ at a point $(x, 0, 0)$ with $x > 0$. Let $x'$ be such that $xx' = R^2$. By similar
triangles

$$\frac{x'}{R} = \frac{R}{x} = \frac{r_{x'}}{r_x}$$

for any point on the sphere of radius R, where $r_x$ and $r_{x'}$ denote the distances
from that point to $(x, 0, 0)$ and $(x', 0, 0)$.

So placing a charge $-Q(R/x)$ at the point $(x', 0, 0)$ gives potential zero on the
sphere. This allows us to compute the potential for a point charge placed anywhere
in the interior of a grounded conducting sphere.
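
The image-charge construction can be checked numerically. The following is only a sketch under stated assumptions: it takes the Euclidean-invariant potential $f(r) = c/r$ with $c = 1$, and the particular values of R, x and Q, as well as the helper names `potential` and `sphere_point`, are illustrative choices that do not come from the text.

```python
import math

def potential(point, charges, c=1.0):
    """Potential at `point` from point charges, assuming f(r) = c / r."""
    return sum(c * q / math.dist(point, pos) for q, pos in charges)

R, x, Q = 2.0, 0.5, 3.0      # illustrative sphere radius, charge position, charge
x_img = R * R / x            # image point, chosen so that x * x' = R^2
q_img = -Q * R / x           # image charge -Q(R/x)
charges = [(Q, (x, 0.0, 0.0)), (q_img, (x_img, 0.0, 0.0))]

def sphere_point(theta, phi):
    """A point on the sphere of radius R in spherical angles."""
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))

# Sample the sphere and record the largest potential magnitude found on it.
samples = [sphere_point(0.1 + 0.3 * i, 0.7 * j) for i in range(10) for j in range(9)]
max_abs_potential = max(abs(potential(p, charges)) for p in samples)
```

Up to rounding error, `max_abs_potential` vanishes, which is the statement that the sphere of radius R is the zero equipotential of the pair of charges.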
We now must turn to the problem of finding the analog, in electrostatics, of the
The first property we recall of the matrix C is that it was diagonal in terms of
the basis given by the branches. In other words, the branch charge contribution 


9 w mwJ w* 9 iili 1 W* tunkj iv 
in electrostatics: we want a linear map $E \to D$ such that the value of D at any point
depends only on the value of E at that same point. In other words, if

$$E = E_x\,dx + E_y\,dy + E_z\,dz \quad\text{and}\quad D = D_x\,dy\wedge dz + D_y\,dz\wedge dx + D_z\,dx\wedge dy$$

(where the coefficients $E_x,\ldots,D_z$ are functions), then the relationship between the E
and D should be given by a matrix of functions

$$(\varepsilon_{ij})$$

so that

$$D_x = \varepsilon_{11}E_x + \varepsilon_{12}E_y + \varepsilon_{13}E_z,$$

$$D_y = \varepsilon_{21}E_x + \varepsilon_{22}E_y + \varepsilon_{23}E_z,$$

$$D_z = \varepsilon_{31}E_x + \varepsilon_{32}E_y + \varepsilon_{33}E_z.$$

Each of the entries $\varepsilon_{ij}$ will be a function, and the matrix function $\varepsilon = (\varepsilon_{ij})$ gives
the dielectric properties of the medium. We can write the relationship between D
and E as


$$D = \varepsilon E.$$

So now the equations of electrostatics become

$$E = -du, \qquad dD = 4\pi\rho\,dx\wedge dy\wedge dz$$

and

$$D = \varepsilon E.$$

We may combine these equations by introducing the Laplace operator defined by

$$(\Delta u)\,dx\wedge dy\wedge dz = d\,\varepsilon\,du$$

so we get Poisson's equation

$$\Delta u = -4\pi\rho.$$

To make any further progress, we must get some further information about the
matrix $\varepsilon$.

For general media, all we can say about $\varepsilon$ is that it is a symmetric matrix, for much
the same reason that C was. Beyond that, we can say nothing. There do exist
situations - crystals under stress, for example - where $\varepsilon$ is a variable symmetric
matrix.

The medium is called homogeneous if $\varepsilon$ is invariant under translations, and so is
a constant. The medium is called isotropic if the relationship between E and D is
invariant under rotations. We now examine the possible forms of $\varepsilon$ if the
medium is homogeneous and isotropic.


16.6. The star operator in Euclidean three-dimensional space 

We begin by exhibiting a linear map

$$\star:\Lambda^1(\mathbb{R}^{3*})\to\Lambda^2(\mathbb{R}^{3*})$$

which is rotation invariant. Define

$$\star dx = dy\wedge dz,$$

$$\star dy = dz\wedge dx,$$

$$\star dz = dx\wedge dy$$

and extend by linearity; that is, define $\star:\Lambda^1(\mathbb{R}^{3*})\to\Lambda^2(\mathbb{R}^{3*})$ by

$$\star(a\,dx + b\,dy + c\,dz) = a\,dy\wedge dz + b\,dz\wedge dx + c\,dx\wedge dy.$$

Let

$$\omega = a\,dx + b\,dy + c\,dz$$

and

$$\sigma = A\,dx + B\,dy + C\,dz$$

and notice that

$$\star\omega\wedge\sigma = (aA + bB + cC)\,dx\wedge dy\wedge dz.$$

We can write this last equation as follows. The Euclidean scalar product on $\mathbb{R}^3$ gives
rise to a scalar product on $\mathbb{R}^{3*}$ - on one-forms - given by

$$(\omega,\sigma) = aA + bB + cC.$$

The scalar product, together with the orientation, picks out a preferred volume
form $dx\wedge dy\wedge dz$. The last equation reads

$$\star\omega\wedge\sigma = (\omega,\sigma)\,dx\wedge dy\wedge dz. \qquad (16.5)$$

A moment's reflection shows that this equation determines the map $\star$ uniquely. But
the right-hand side of this equation involves (as a function of $\omega$ and $\sigma$) only the
scalar product and the orientation. Any rotation preserves these. Hence

The $\star$ operator is rotation invariant.
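
Rotation invariance of the star operator can be illustrated with a small computation. This is a sketch resting on one assumption that is not spelled out in the text: if a linear map acts on the coefficient triple of a one-form by a matrix M, it acts on coefficient triples of two-forms (in the basis dy∧dz, dz∧dx, dx∧dy) by the cofactor matrix of M. Since the star operator is the identity on coefficient triples in these bases, it commutes with M exactly when cof(M) = M. The helper names below are illustrative.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def cofactor(m):
    """Cofactor matrix of a 3x3 matrix: the induced action on two-forms."""
    def minor(i, j):
        rows = [r for r in range(3) if r != i]
        cols = [c for c in range(3) if c != j]
        return (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
    return [[(-1) ** (i + j) * minor(i, j) for j in range(3)] for i in range(3)]

def star_defect(m):
    """Largest entry of cof(m) - m; it is zero exactly when * commutes with m."""
    cof = cofactor(m)
    return max(abs(cof[i][j] - m[i][j]) for i in range(3) for j in range(3))

def rot_z(t):
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]

def rot_x(t):
    return [[1.0, 0.0, 0.0],
            [0.0, math.cos(t), -math.sin(t)],
            [0.0, math.sin(t), math.cos(t)]]

rotation = matmul(rot_z(0.7), rot_x(-1.2))                      # a generic rotation
stretch = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # not a rotation
rotation_defect = star_defect(rotation)
stretch_defect = star_defect(stretch)
```

The defect vanishes (to rounding) for the rotation but not for the stretch, in line with the claim that the star operator is preserved by rotations and not by general linear maps.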

We claim that, up to a scalar factor, $\star$ is the only map of $\Lambda^1(\mathbb{R}^{3*})\to\Lambda^2(\mathbb{R}^{3*})$ which
is rotationally invariant. That is, we claim that, if $r:\Lambda^1(\mathbb{R}^{3*})\to\Lambda^2(\mathbb{R}^{3*})$ is some
other map which is rotation invariant, then $r = a\star$ for some scalar a. Indeed,
we claim, first, that either r = 0 or r has zero kernel. Indeed, ker r is a subspace of
$\mathbb{R}^{3*}$. If r is invariant under rotations, this subspace would have to be invariant
under rotations. But there are no rotation-invariant subspaces of $\mathbb{R}^{3*}$ other than
the trivial spaces {0} and $\mathbb{R}^{3*}$. Thus either r = 0 or r is an isomorphism. If r = 0,
there is nothing to prove. If r is an isomorphism, consider the map

$$l = r^{-1}\star, \qquad l:\mathbb{R}^{3*}\to\mathbb{R}^{3*}.$$

By hypothesis, the map $l$ is rotation invariant, i.e.,

$$l(R\omega) = Rl(\omega)$$

for any rotation R. Let us write

$$l\omega = a\omega + L\omega$$

where $L\omega$ is perpendicular to $\omega$. In other words, decompose $l\omega$ into components
along and perpendicular to $\omega$. By rotational invariance, the coefficient a given by

$$(l\omega,\omega) = a\,\|\omega\|^2$$

is independent of $\omega$. We claim that $L\omega = 0$. Indeed, we must have $LR\omega = RL\omega$
for any rotation R. Choose some rotation R which fixes $\omega$, but rotates non-trivially
in the $\omega^\perp$ plane. Then $R\omega = \omega$ but $RL\omega \ne L\omega$ if $L\omega \ne 0$. Thus $L\omega = 0$. Thus

$$l\omega = a\omega$$

with $a \ne 0$ (since $l$ is an isomorphism), or

$$r\omega = a^{-1}\star\omega$$

which is what was to be proved.

Thus, up to scalar multiples, $\star$ is the only rotation-invariant map from $\Lambda^1(\mathbb{R}^{3*})$
to $\Lambda^2(\mathbb{R}^{3*})$.

(The converse is also worth noting. The $\star$ operator determines the scalar product
( , ) occurring on the right-hand side of (16.5) - hence giving the $\star$ operator in $\mathbb{R}^{3*}$
determines the scalar product and orientation.)

We can now let

$$\omega = a\,dx + b\,dy + c\,dz$$

be a differential form. That is, we can let a, b and c be functions. Define the $\star$
operator as before

$$\star\omega = a\,dy\wedge dz + b\,dz\wedge dx + c\,dx\wedge dy.$$

Once again

$$\star\omega\wedge\tau = (\omega,\tau)\,dx\wedge dy\wedge dz$$

where $\omega$ and $\tau$ are now differential forms.

We can now use the assumption that our medium is homogeneous and isotropic. 
This implies that 



it is a property of the vacuum that $\varepsilon = \varepsilon_0\star$ where $\varepsilon_0$ is a constant. Conversely, since
the $\star$ operator from $\Lambda^1(\mathbb{R}^{3*})$ to $\Lambda^2(\mathbb{R}^{3*})$ determines the scalar product, we may say
that the dielectric property of the vacuum determines the Euclidean geometry of space.

In what follows, we will assume that the units of E, D and $\varepsilon$ have been chosen
so that the $\varepsilon_0$ is absorbed into the $\star$ operator. Thus the equations of electrostatics
have become






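$$E = -du, \qquad D = \star E, \qquad dD = 4\pi\rho\,dx\wedge dy\wedge dz$$

so

$$\Delta u = -4\pi\rho$$

where a direct computation shows that

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}.$$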


It is important to observe that the $\star$ operator is not invariant under differentiable
maps. We do not have

$$\phi^*(\star\omega) = \star\phi^*(\omega)$$

unless $\phi$ is an orientation-preserving Euclidean transformation. Nevertheless, we
can compute the $\star$ operator in more general coordinate systems by using its
definition. For example, suppose we wish to compute the $\star$ operator in polar
coordinates given by

$$r = \sqrt{x^2 + y^2 + z^2}, \qquad \theta = \arctan\bigl(\sqrt{x^2 + y^2}/z\bigr), \qquad \varphi = \arctan(y/x).$$

If we calculate the differentials of these functions and express the results in terms
of the basis dx, dy, dz with coordinates expressed for convenience in terms of $r$, $\theta$
and $\varphi$, we find

$$dr = \sin\theta\cos\varphi\,dx + \sin\theta\sin\varphi\,dy + \cos\theta\,dz,$$

$$d\theta = \frac{1}{r}\bigl(\cos\theta\cos\varphi\,dx + \cos\theta\sin\varphi\,dy - \sin\theta\,dz\bigr),$$

$$d\varphi = \frac{1}{r\sin\theta}\bigl(-\sin\varphi\,dx + \cos\varphi\,dy\bigr).$$

Direct calculation, using the orthonormality of dx, dy and dz, shows that $dr$, $d\theta$,
and $d\varphi$ are orthogonal elements of $\mathbb{R}^{3*}$ at each point $(r,\theta,\varphi)$, and that

$$(dr,dr) = 1, \qquad (d\theta,d\theta) = \frac{1}{r^2}, \qquad (d\varphi,d\varphi) = \frac{1}{r^2\sin^2\theta}.$$

The best way to summarize all of these results is to notice that $\{dr,\ r\,d\theta,\ r\sin\theta\,d\varphi\}$
forms an orthonormal basis for $\Lambda^1(\mathbb{R}^{3*})$ so that we can calculate with these three
differentials just as we do with dx, dy and dz. Thus

$$\star dr = r^2\sin\theta\,d\theta\wedge d\varphi,$$

$$\star(r\,d\theta) = -r\sin\theta\,dr\wedge d\varphi,$$

$$\star(r\sin\theta\,d\varphi) = r\,dr\wedge d\theta.$$

Of course, using linearity, we can then compute $\star d\theta$ or $\star d\varphi$.
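
The orthonormality of the three rescaled differentials can be confirmed numerically at a sample point. The sketch below assumes the standard expressions for dr, dθ and dφ in the basis dx, dy, dz; the sample point and the helper names `frame`, `dot` and `cross` are illustrative choices, not taken from the text.

```python
import math

def frame(r, theta, phi):
    """Coefficients (of dx, dy, dz) of dr, r*dtheta and r*sin(theta)*dphi."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    dr = (st * cp, st * sp, ct)
    dtheta = (ct * cp / r, ct * sp / r, -st / r)
    dphi = (-sp / (r * st), cp / (r * st), 0.0)
    return (dr,
            tuple(r * c for c in dtheta),
            tuple(r * st * c for c in dphi))

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

e1, e2, e3 = frame(1.7, 0.9, 2.3)          # an arbitrary sample point
gram = [[dot(a, b) for b in (e1, e2, e3)] for a in (e1, e2, e3)]
# Positive orientation means the frame behaves like (dx, dy, dz),
# which is what justifies star(dr) = (r dtheta) ^ (r sin(theta) dphi).
orientation = dot(cross(e1, e2), e3)
```

The Gram matrix comes out as the identity and the orientation as +1, so the frame can indeed be used in place of dx, dy, dz when computing the star operator.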


16.7. Green’s formulas 


In this section we give the continuous version of the Green's formulas of
Chapter 13. Let U be a bounded region in $\mathbb{R}^3$ with boundary $\partial U$. Let

$$E = E_x\,dx + E_y\,dy + E_z\,dz \quad\text{and}\quad F = F_x\,dx + F_y\,dy + F_z\,dz$$

be linear differential forms defined on U. We define their scalar product $(E,F)_U$ by

$$(E,F)_U = \int_U E\wedge\star F = \int_U (E_xF_x + E_yF_y + E_zF_z)\,dx\,dy\,dz = (F,E)_U.$$

We define the corresponding electrostatic energy to be $\tfrac12\|E\|_U^2$ where

$$\|E\|_U^2 = (E,E)_U = \int_U (E_x^2 + E_y^2 + E_z^2)\,dx\,dy\,dz.$$

Now suppose that

$$E = -du \quad\text{and}\quad F = -dv.$$

Since $d(u\star dv) = du\wedge\star dv + u\,\Delta v\,dx\wedge dy\wedge dz$, Stokes' theorem gives

$$(du,dv)_U = \int_{\partial U} u\star dv - \int_U u\,\Delta v\,dx\wedge dy\wedge dz. \qquad (16.6)$$

This is known as Green's first formula. Since $(E,F)_U = (F,E)_U$, we can interchange
u and v in Green's first formula and subtract. We get

$$\int_{\partial U}(u\star dv - v\star du) = \int_U (u\,\Delta v - v\,\Delta u)\,dx\wedge dy\wedge dz \qquad (16.7)$$

which is called Green's second formula. Let P be a point of U and let us draw a
small ball $B_\varepsilon$ of radius $\varepsilon$ centered at P and completely contained in U. Let us


Figure 16.12 











prove, u is determined by its values on $\partial U$ alone. To begin, let us apply the preceding

Now $r_P$ is constant on $\partial W$ and, by Stokes' theorem,

$$\int_{\partial W}\star du = \int_W \Delta u\,dx\wedge dy\wedge dz = 0.$$

Thus

But the expression on the right is just the average of u over the sphere $\partial W$.
We have thus proved:

If u is a function harmonic in some domain, then the value of u at any point 
is equal to its average value on any sphere centered at that point, whose 
interior is completely contained in the domain. 
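
The mean value property lends itself to a quick numerical check. The sketch below uses a harmonic polynomial chosen purely for illustration, and averages it over a sphere with a simple product rule (midpoints in cos θ, equally spaced points in the azimuthal angle); the function names, the center and the radius are assumptions of this example.

```python
import math

def u(x, y, z):
    """An illustrative harmonic polynomial: its Laplacian is 2 - 2 + 0 = 0."""
    return x * x - y * y + 3.0 * x * z + 2.0 * z + 1.0

def sphere_average(f, center, radius, n_z=200, n_phi=64):
    """Average of f over a sphere, sampling equal-area cells."""
    cx, cy, cz = center
    total = 0.0
    for i in range(n_z):
        cos_t = -1.0 + (2.0 * i + 1.0) / n_z      # midpoints in cos(theta)
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        for j in range(n_phi):
            phi = 2.0 * math.pi * j / n_phi
            total += f(cx + radius * sin_t * math.cos(phi),
                       cy + radius * sin_t * math.sin(phi),
                       cz + radius * cos_t)
    return total / (n_z * n_phi)

center = (0.3, -0.2, 0.5)
average = sphere_average(u, center, 0.7)
value_at_center = u(*center)
```

For this polynomial the computed average agrees with the value at the center up to rounding error, as the mean value property predicts.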

We can draw a number of startling consequences. Suppose there is a point $x_0$
and a neighborhood, Z, of $x_0$ such that

$$u(x) \le u(x_0) \quad\text{at all points in } Z.$$

Let $S_a$ be a sphere of radius a lying in Z and centered at $x_0$. Then $u(x) \le u(x_0)$ at
all points of $S_a$. Thus the average of u over $S_a$ is $\le u(x_0)$. Now suppose that there
were some point y on $S_a$ at which $u(y) < u(x_0)$. Then $u(x) < u(x_0)$ at all points x
near y, and therefore the average of u over $S_a$ would be strictly less than $u(x_0)$.
But this is impossible. Thus

$$u(y) = u(x_0) \quad\text{at all points of } S_a.$$



Now suppose that W is an open set that is connected; that is, suppose that any
two points of W can be joined by a continuous curve lying entirely in W. Suppose
u achieves its maximum at some $x_0\in W$. Let y be any point of W, and let C be a
curve joining $x_0$ to y. About each x on the curve we can find a sufficiently small
ball with center x lying entirely in W. By the compactness of C we can choose a
finite number of these balls which cover C. We can therefore formulate the
following. There are a finite number of spheres $S_{a_1},\ldots,S_{a_k}$ such that each sphere
and its interior lie entirely in W, $S_{a_1}$ has center $x_0$, the center $x_i$ of $S_{a_{i+1}}$ lies on
$S_{a_i}$, and $y\in S_{a_k}$. (See figure 16.14.) But this implies that $u(x_0) = u(x_1) = \cdots = u(x_{k-1}) = u(y)$.
In other words, we have established:

Let u be harmonic in a connected open set W, and suppose that u achieves
its maximum value at some $x_0\in W$. Then u is constant on W.

An immediate corollary of this result is: 

Let U be a connected open set and let $\bar U$ denote its closure, so $\bar U = U\cup\partial U$,
where $\partial U$ denotes the boundary of U. Suppose that $\bar U$ is bounded. Then if
u is a function that is continuous on $\bar U$ and harmonic in U,

$$u(x) < \max_{y\in\partial U} u(y) \quad\text{for all } x\in U$$

unless u is a constant.

Proof. In fact, since $\bar U$ is closed and bounded it is 'compact' and u is continuous.
It is a standard theorem from real analysis that u must achieve its maximum at
some point $x_0$ of $\bar U$. If we could actually choose $x_0\in U$, then u would have to be
a constant by our preceding results. If u is not a constant, then $x_0\in\partial U$, and we
have proved the proposition.

From this, we deduce the following:

Let U be a connected open set with $\bar U$ compact. Let u and v be functions that
are continuous in $\bar U$ and harmonic in U. Suppose that

$$u(y) = v(y) \quad\text{for all } y\in\partial U.$$

Then

$$u(x) = v(x) \quad\text{for all } x\in\bar U.$$

Proof. In fact, $u - v$ is harmonic and vanishes on $\partial U$. Thus $u(x) - v(x) \le 0$ for $x\in U$.
Similarly $v(x) - u(x) \le 0$, which implies the proposition.

An alternative way of formulating the last proposition is to say that on a domain
U a harmonic function is completely determined by its boundary values. This is a
uniqueness theorem: there is at most one harmonic function with given boundary
values. It suggests the problem of deciding whether the corresponding existence
theorem is true. This problem is known as Dirichlet's problem.

Dirichlet's problem. Given a continuous function f defined on $\partial U$, does there exist
a function u that is continuous in $\bar U$ and harmonic in U and such that $u(y) = f(y)$ for
all $y\in\partial U$?
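
A discrete analogue of Dirichlet's problem is easy to experiment with. The sketch below is an illustration only and makes several assumptions not found in the text: it works on a two-dimensional square grid rather than in three dimensions, replaces "harmonic" by the condition that each interior value is the average of its four neighbors, and solves by Gauss-Seidel relaxation. The boundary data $x^2 - y^2$ is chosen because it satisfies the discrete averaging condition exactly, so the computed solution can be compared against it.

```python
def solve_dirichlet(boundary, n, sweeps=2000):
    """Gauss-Seidel relaxation for the discrete Laplace equation on an n x n grid."""
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i in (0, n - 1) or j in (0, n - 1):
                u[i][j] = boundary(i / (n - 1), j / (n - 1))
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Discrete harmonicity: each value is the mean of its neighbors.
                u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                  + u[i][j - 1] + u[i][j + 1])
    return u

def boundary(x, y):
    """Illustrative boundary data; x^2 - y^2 is also discretely harmonic."""
    return x * x - y * y

n = 17
grid = solve_dirichlet(boundary, n)

interior = [(i, j) for i in range(1, n - 1) for j in range(1, n - 1)]
edge = [(i, j) for i in range(n) for j in range(n)
        if i in (0, n - 1) or j in (0, n - 1)]
max_error = max(abs(grid[i][j] - boundary(i / (n - 1), j / (n - 1)))
                for i, j in interior)
interior_max = max(grid[i][j] for i, j in interior)
boundary_max = max(grid[i][j] for i, j in edge)
```

The relaxed grid reproduces $x^2 - y^2$ in the interior, and its interior maximum stays below the boundary maximum, which mirrors the uniqueness theorem and the maximum principle in this finite setting.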


16.9. The 



Let us give an alternative proof of the uniqueness of the solution of Dirichlet's
problem for a bounded domain, U. We will use Green's first formula (16.6)

$$(du,dv)_U = \int_{\partial U} u\star dv - \int_U u\,\Delta v\,dx\wedge dy\wedge dz.$$

Let $C^0_{\mathrm{int}}$ denote the space of differentiable functions which vanish on $\partial U$. Let H
denote the space of harmonic functions on U - functions v that are continuous on $\bar U$
and satisfy $\Delta v = 0$ in U. Then, if $u\in C^0_{\mathrm{int}}$, then, since $u = 0$ on $\partial U$,

$$\int_{\partial U} u\star dv = 0 \quad\text{for any smooth function } v.$$

If $v\in H$, then $u\,\Delta v = 0$ for any function u. Thus

If $u\in C^0_{\mathrm{int}}$ and $v\in H$, then

$$(du,dv)_U = 0.$$

In other words,

The spaces $dC^0_{\mathrm{int}}$ and $dH$ are orthogonal under $(\ ,\ )_U$.

In particular, suppose u is harmonic and vanishes at the boundary. Then

$$(du,du)_U = 0.$$

But

$$(du,du)_U = \int_U\left[\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial u}{\partial z}\right)^2\right]dx\wedge dy\wedge dz,$$

so all three partial derivatives of u must vanish throughout
U. But then u must be a constant, and since u = 0 on $\partial U$, u = 0. Thus a harmonic
function which vanishes on $\partial U$ must vanish identically. As before, this proves the
uniqueness part of Dirichlet's problem.

Now the spaces $dC^0$, $dH$, etc., are infinite-dimensional vector spaces. We have not,
in this book, developed a theory of such spaces, or of projection operators $\pi$, on such
spaces. Such a theory can be developed, and it can be proved that the following
principle, due to Hermann Weyl, holds:

The space $dH$ is the entire orthogonal complement of $dC^0_{\mathrm{int}}$ inside $dC^0$.

In other words,

If $E = dv$ and $(E, du) = 0$ for all $u\in C^0_{\mathrm{int}}$, then $\Delta v = 0$ in U.

Therefore, given any function $\psi$, we can break $d\psi$ up into its components

$$d\psi = du + dv \quad\text{where } u\in C^0_{\mathrm{int}} \text{ and } v\in H.$$

In other words, $du = \pi\,d\psi$ where $\pi$ denotes the projection onto the $dC^0_{\mathrm{int}}$ component.

Consider the problem: given a function $\phi$ defined on $\partial U$, find a function v such that $\Delta v = 0$ and $v = \phi$ on $\partial U$.
Solution: choose any $\psi$ which agrees with $\phi$ on $\partial U$ and is defined throughout U.
Decompose $d\psi$ into its components as above. Then, since u vanishes on $\partial U$, we
can take $v = \psi - u$, which is harmonic and agrees with $\phi$ on $\partial U$.

Notice that since $(du,dv)_U = 0$, we have

$$\|d\psi\|_U^2 = \|du\|_U^2 + \|dv\|_U^2, \quad\text{so}\quad \|d\psi\|_U^2 \ge \|dv\|_U^2.$$

Thus, among all functions which take on the values $\phi$ on the boundary, the
solution, v, of the Dirichlet problem can be characterized as the function with the













some general principles of electrostatics. 

16.10. Green’s functions 

Suppose that U is a domain for which the Dirichlet problem can be solved. We 
shall now develop, for U, the analog of the Green’s function that we have discussed 


For each x in $\mathbb{R}^3$, let us set

$$K(x,y) = -\frac{1}{4\pi}\,\frac{1}{\|x - y\|}.$$

Then

$$\star dK(x,\cdot) = \frac{1}{4\pi}\,\tau_x,$$

where $\tau_x$ denotes the solid angle form about x.

We can rewrite equation (16.8) of section 16.6 in terms of K as

$$u(x) = \int_{\partial U}\bigl(u\star dK(x,\cdot) - K(x,\cdot)\star du\bigr) + \int_U K(x,\cdot)\,\Delta u\,dV.$$

Now for fixed $x\in U$ the function $K(x,y)$ is a differentiable function of y as y varies
U and continuous in U and such that 


Furthermore, for fixed y,h(x,y) is a continuous function of x. In fact, by the 


on oU, and hence 


and $K(x,z)$ is clearly uniformly continuous in x for all $z\in\partial U$ so long as x stays a
fixed distance away from $\partial U$. We have thus constructed a function $h_U$, such that

(i) $h_U(x,y)$ is a continuous function of x and y for $x,y\in U$, and is differentiable
in y for $y\in U$;

(ii) For each fixed x, $\Delta_y h_U = 0$; i.e., $\Delta h_U(x,\cdot) = 0$;

(iii) $G_U(x,y) = K(x,y) + h_U(x,y) = 0$ for $y\in\partial U$.

The function $G_U$ is called the Green's function of the domain U. Let us suppose for a
moment that $G_U$ exists, and let us derive some of its properties. We write $G = G_U$









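Let $B_{x,\varepsilon}$ and $B_{y,\varepsilon}$ be small balls about x and y. Let $u = G(x,\cdot)$ and $v = G(y,\cdot)$ in
Green's second formula, applied to the domain $U - B_{x,\varepsilon} - B_{y,\varepsilon}$. Since u and v
are harmonic in the domain and vanish on $\partial U$, we obtain

$$\int_{\partial B_{x,\varepsilon}} G(x,\cdot)\star dG(y,\cdot) + \int_{\partial B_{y,\varepsilon}} G(x,\cdot)\star dG(y,\cdot) = \int_{\partial B_{x,\varepsilon}} G(y,\cdot)\star dG(x,\cdot) + \int_{\partial B_{y,\varepsilon}} G(y,\cdot)\star dG(x,\cdot).$$

We will show that the left-hand side approaches $G(x,y)$ and the right-hand side
approaches $G(y,x)$.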




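$4\pi\star dK$ is the solid angle about y. The second term tends to zero, since $G(x,\cdot)$ and h
are smooth functions on $B_{y,\varepsilon}$. On the other hand,

$$\int_{\partial B_{x,\varepsilon}} G(x,\cdot)\star dG(y,\cdot) = \int_{\partial B_{x,\varepsilon}} K(x,\cdot)\star dG(y,\cdot) + \int_{\partial B_{x,\varepsilon}} h(x,\cdot)\star dG(y,\cdot).$$

The second term tends to zero, as above. The first term can be written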






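$$\int_{\partial B_{x,\varepsilon}} K(x,\cdot)\star dG(y,\cdot) = -\frac{1}{4\pi\varepsilon}\int_{\partial B_{x,\varepsilon}}\star dG(y,\cdot) = -\frac{1}{4\pi\varepsilon}\int_{B_{x,\varepsilon}}\Delta G(y,\cdot)\,dx\wedge dy\wedge dz = 0$$

since $G(y,\cdot)$ is harmonic in $B_{x,\varepsilon}$. This proves that $G(x,y) = G(y,x)$.

Let u be any smooth function on $\bar U$. Apply Green's second formula to u and
$v = G(x,\cdot)$ on $U - B_{x,\varepsilon}$. We get (since $G(x,\cdot) = 0$ on $\partial U$),

$$\int_{\partial U} u\star dG(x,\cdot) - \int_{\partial B_{x,\varepsilon}} u\star dG(x,\cdot) + \int_{\partial B_{x,\varepsilon}} G(x,\cdot)\star du = -\int_{U - B_{x,\varepsilon}} G(x,\cdot)\,\Delta u\,dx^1\wedge\cdots\wedge dx^n.$$

As $\varepsilon\to 0$, the integral of $u\star dG(x,\cdot)$ over $\partial B_{x,\varepsilon}$ tends to $u(x)$, as before, while

$$\int_{\partial B_{x,\varepsilon}} G(x,\cdot)\star du = \int_{\partial B_{x,\varepsilon}} K(x,\cdot)\star du + \int_{\partial B_{x,\varepsilon}} h(x,\cdot)\star du = -\frac{1}{4\pi\varepsilon}\int_{\partial B_{x,\varepsilon}}\star du + \int_{\partial B_{x,\varepsilon}} h(x,\cdot)\star du = O(\varepsilon) + O(\varepsilon^2)$$

and so tends to zero. We get

$$u(x) = \int_U G(x,\cdot)\,\Delta u\,dx^1\wedge\cdots\wedge dx^n + \int_{\partial U} u\star dG(x,\cdot).$$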


‘ ' a solution 






Similarly, if we know that there exists a smooth solution to the problem

$$\Delta u = 0, \qquad u(x) = f(x) \text{ for } x\in\partial U,$$

then it is given by

$$u(x) = \int_{\partial U} u\star dG(x,\cdot) \qquad\text{(solution of the Dirichlet problem).} \qquad (16.12)$$



16.11. The Poisson integral formula 


It is important to observe that these formulas are consequences of the existence 
of Green’s function for U. Thus they are valid whenever we can find the function 
h such that properties (ii) and (iii) hold. For example, we can explicitly construct 
the Green’s function for a ball, B R , of radius R. For simplicity, let us assume that 
the ball is centered at the origin. 

For $x\ne 0$, let $x'$ be its image under inversion with respect to the sphere of
radius R:

$$x' = \frac{R^2}{\|x\|^2}\,x$$

(see figure 16.11).

Define the function $G_R$ by

$$4\pi G_R(x,y) = \begin{cases} -\dfrac{1}{\|y - x\|} + \dfrac{R}{\|x\|}\,\dfrac{1}{\|y - x'\|} & \text{if } x\ne 0,\\[2ex] -\dfrac{1}{\|y\|} + \dfrac{1}{R} & \text{if } x = 0.\end{cases}$$

If $x\in B_R$, then $x'\notin\bar B_R$, and so the second terms on the right-hand side of the
equation are continuous and harmonic on $B_R$. We must merely check that property
(iii) holds. Now for $\|y\| = R$ we have, by similar triangles (or direct computation),

$$\frac{R}{\|x\|} = \frac{\|y - x'\|}{\|y - x\|}$$

so that

$$G_R(x,y) = 0 \quad\text{for } \|y\| = R.$$

This is (iii), so we have verified that $G_R$ is the Green's function for the ball of
radius R.

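To apply our formula for the solution of Dirichlet's problem using the Green's
function, we must compute $\star dG_R$ on the sphere of radius R. Now

$$4\pi\star dG_R(x,\cdot) = \sum_i\left[\frac{y^i - x^i}{\|y - x\|^3} - \frac{R}{\|x\|}\,\frac{y^i - x'^i}{\|y - x'\|^3}\right]\star dy^i.$$

For $\|y\| = R$ we have $\|y - x'\| = (R/\|x\|)\,\|y - x\|$ and $(\|x\|^2/R^2)\,x' = x$, so that on the sphere

$$4\pi\star dG_R(x,\cdot) = \frac{R^2 - \|x\|^2}{R^2\,\|y - x\|^3}\sum_i y^i\star dy^i = \frac{R^2 - \|x\|^2}{R^2\,\|y - x\|^3}\,\star(r\,dr).$$

By the formula

$$u(x) = \int_{\partial U} u(y)\star dG(x,y)$$

we can now express u in terms of its boundary values.
But $\star(r\,dr) = R\,dS_R$, where $dS_R$ is the volume element of the sphere of radius R. If we
substitute into the last formula, we obtain

$$u(x) = \frac{R^2 - \|x\|^2}{4\pi R}\int_{S_R}\frac{u(y)}{\|y - x\|^3}\,dS_R \qquad\text{(the Poisson integral formula).} \qquad (16.13)$$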


In the proof of (16.13) we used the assumption that the function u is differentiable
in some neighborhood of the ball $B_R$ and is harmonic for $\|x\| < R$. Actually, all
that we need to assume is that u is differentiable and harmonic for $\|x\| < R$ and
continuous on the closed ball $\|x\| \le R$. In fact, for any $\|x\| < R$, equation (16.13)
will be valid with R replaced by $R_a$, where $\|x\| < R_a < R$. If we then let $R_a$
approach R, we recover (16.13) by virtue of the assumed continuity of u.
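
The Poisson integral formula can be tested numerically on a function that is known to be harmonic. The sketch below assumes the formula in the form $u(x) = \frac{R^2 - \|x\|^2}{4\pi R}\int_{S_R} u(y)\,\|y - x\|^{-3}\,dS_R$ and approximates the surface integral by a midpoint rule on equal-area cells; the test polynomial, the evaluation point and the function names are illustrative assumptions, and the result is only accurate to the quadrature error.

```python
import math

def poisson_integral(boundary_u, x, radius, n_z=600, n_phi=120):
    """Midpoint-rule approximation of the Poisson integral over the sphere."""
    rho2 = sum(c * c for c in x)
    d_area = 4.0 * math.pi * radius ** 2 / (n_z * n_phi)   # equal-area cells
    total = 0.0
    for i in range(n_z):
        cos_t = -1.0 + (2.0 * i + 1.0) / n_z
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            y = (radius * sin_t * math.cos(phi),
                 radius * sin_t * math.sin(phi),
                 radius * cos_t)
            total += boundary_u(*y) / math.dist(y, x) ** 3
    return (radius ** 2 - rho2) / (4.0 * math.pi * radius) * total * d_area

def u(x, y, z):
    """An illustrative harmonic polynomial (Laplacian 2 - 2 + 0 = 0)."""
    return x * x - y * y + 3.0 * x * z + 2.0 * z + 1.0

point = (0.1, -0.05, 0.15)                     # a point inside the unit ball
approx = poisson_integral(u, point, 1.0)       # uses only the boundary values of u
exact = u(*point)
unit = poisson_integral(lambda *_: 1.0, point, 1.0)   # constant boundary data
```

Within the quadrature error, `approx` matches `exact`, and `unit` is close to 1, which is the special case of the formula for the constant function one.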





open ball and continuous on the closed ball, it satisfies (16.13). Now let us show
that (16.13) solves the Dirichlet problem for prescribed boundary values. Thus
suppose we are given a continuous function u defined on the sphere $S_R$.
Then we are given $u(y)$ for all $y\in S_R$. Define $u(x)$ for $\|x\| < R$ by (16.13). We
must show that

(a) u is harmonic for $\|x\| < R$, and

(b) $u(x)\to u(y_0)$ if $x\to y_0$ and $\|y_0\| = R$.


To prove (a) we observe that $G_R(x,y)$ is a differentiable function of x and y in
the range $\|x\| < R_1 < R$, $R_1 < \|y\| < R^2/R_1$, and is, by construction, a harmonic
function of y. For $\|x\| < R$ and $\|y\| < R$, we know that

$$G_R(x,y) = G_R(y,x).$$

Thus for fixed y with $\|y\| < R$, $G_R(\cdot,y)$ is a harmonic function on $B_R - \{y\}$. Letting
$\|y\|\to R$, we see that $G_R(x,y)$ is a harmonic function of x for $\|x\| < R_1 < R$ for
each fixed $y\in S_R$. Thus $\partial G_R(x,y)/\partial y^i$ is a harmonic function of x for each $y\in S_R$. In
other words, all the coefficients of $\star dG_R(\cdot,y)$ are harmonic functions of x for each
$y\in S_R$, and therefore so is each coefficient of $u(y)\star dG_R(\cdot,y)$. It follows that the
function $u(x) = \int_{S_R} u\star dG_R(x,\cdot)$ is a harmonic function of x, since the integral
converges uniformly (as do the integrals of the various derivatives with respect to x)


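To prove (b) we first remark that the constant one is a harmonic function, so that by (16.13)

$$\frac{R^2 - \|x\|^2}{4\pi R}\int_{S_R}\frac{dS_R}{\|y - x\|^3} = 1 \quad\text{for any } \|x\| < R. \qquad (16.14)$$

Now let $y_0$ be some point of $S_R$, and let u be a continuous function on $S_R$. For
any $\varepsilon > 0$ we can find a $\delta > 0$ such that

$$|u(y) - u(y_0)| < \varepsilon \quad\text{for } \|y - y_0\| \le 2\delta,\ y\in S_R.$$

Let $Z_1 = \{y\in S_R : \|y - y_0\| > 2\delta\}$ and $Z_2 = \{y\in S_R : \|y - y_0\| \le 2\delta\}$. Then by (16.13)
and (16.14) we have, for $\|x\| < R$,

$$u(x) - u(y_0) = \frac{R^2 - \|x\|^2}{4\pi R}\int_{S_R}\frac{u(y) - u(y_0)}{\|y - x\|^3}\,dS_R = I_1 + I_2,$$

where $I_1$ and $I_2$ denote the contributions of $Z_1$ and $Z_2$ to the integral.

Now if $\|y_0 - x\| < \delta$, then for all $y\in Z_1$, we have $\|y - x\| \ge \|y - y_0\| - \|x - y_0\|$,
so that $\|y - x\| > \delta$. Thus for all x such that $\|x - y_0\| < \delta$ the integral occurring
in $I_1$ is uniformly bounded. Since $\|x\|\to R$ as $x\to y_0$, we conclude that $I_1\to 0$ as
$x\to y_0$.

On the other hand,

$$|I_2| \le \varepsilon\,\frac{R^2 - \|x\|^2}{4\pi R}\int_{Z_2}\frac{dS_R}{\|y - x\|^3} \le \varepsilon$$

by (16.14). This proves (b).
We have thus proved: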



Theorem: Let $u$ be a continuous function defined on the sphere $S_R$. There is a
unique continuous function on the closed ball $\|x\| \le R$ which coincides with $u$ on the
sphere $S_R$ and is harmonic for $\|x\| < R$. This function is given by (16.13) for
all $\|x\| < R$.

We have exerted a substantial amount of effort in giving a detailed proof of
the existence of a solution of the Dirichlet problem for the simple case of the 
sphere. We will not give the details of any of the various proofs which work for 
more general regions. But here is an outline of a method of proof which can be 
made to work with some effort. 

Suppose we approximate Euclidean space by a network. Let us place a node at
every vector $v$ whose coordinates are dyadic fractions with denominator $2^N$, and
join neighboring nodes by branches. In
this network, let us assign capacitance $2^N$ to each branch. Let $\Delta_N$ denote the Laplace
operator for this network. For any twice-differentiable function $u$, it is easy to see
that 


If we are now given a bounded region $U$ with boundary $\partial U$, we can consider the
network of all nodes in $U$ and all branches joining these nodes. We can declare
the nodes closest to the boundary to be boundary nodes. By our results for finite
networks, we can solve the Dirichlet problem for this finite approximation. Given
any continuous function $\phi$ on $\partial U$ we can assign values, $\phi_N$, on the boundary of
our finite approximation by taking $\phi_N(n) = \phi(x)$ where $x$ is the nearest point to $n$
on the boundary. (If there are several points equally near, choose $x$ to be one of
them.) In this way, we have approximated the continuous Dirichlet problem by a
discrete, finite problem we know how to solve. We expect, as $N \to \infty$, that the
sequence of solutions $u_N$ will converge to a function $u$ (defined on all the dyadic
vectors and extended by continuity to all of $U$) which solves the honest Dirichlet
problem. In fact, this method works, at least if the shape of the boundary is not
too weird. The proof requires careful estimates which are beyond the scope of this
book.
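
The finite approximation can be tried out numerically. The following is only a minimal sketch under simplifying assumptions that are not in the text: it works on the unit square in the plane rather than in space, takes all branch capacitances equal (so that a discrete harmonic function is the average of its four neighbors), and solves the finite problem by repeated relaxation. The names `solve_dirichlet` and `phi` are illustrative.

```python
def solve_dirichlet(N, boundary, sweeps=500):
    """Relax a grid with spacing 2**-N on the unit square to a discrete harmonic function.

    Boundary nodes keep the prescribed values; every interior node is repeatedly
    replaced by the average of its four neighbors (Gauss-Seidel relaxation).
    """
    n = 2 ** N
    h = 1.0 / n
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i in (0, n) or j in (0, n):
                u[i][j] = boundary(i * h, j * h)
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return u, h


def phi(x, y):
    # Boundary data taken from a function that is harmonic in the plane.
    return x * x - y * y


u, h = solve_dirichlet(3, phi)
n = len(u) - 1
# Largest deviation of the network solution from the harmonic function itself.
max_err = max(abs(u[i][j] - phi(i * h, j * h)) for i in range(n + 1) for j in range(n + 1))
```

For this particular boundary function the four-neighbor average of $x^2 - y^2$ reproduces the function exactly, so the network solution agrees with the continuous one at every node up to the relaxation error. For general boundary data one only expects agreement in the limit of fine networks.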


Summary 

A Electrostatics and differential forms 

You should be able to state the laws of electrostatics in terms of the differential forms 

D and E. 

You should know how to state, prove, and apply Green’s formulas for
electrostatics in $\mathbb{R}^3$.


B The star operator in $\mathbb{R}^3$

Given a differential form $\omega$ on $\mathbb{R}^3$, expressed in terms of coordinates whose
differentials are mutually orthogonal, you should be able to construct an
orthonormal basis and use it to construct ${\star}\omega$.

You should be able to describe the relation between electrostatics in $\mathbb{R}^3$ and
capacitive networks, explaining how the star operator plays the same role as the
capacitance matrix.


Exercises

16.1. Let $D = \tau_0 = (x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy)/r^3$ where $r^2 = x^2 + y^2 + z^2$.
This represents the electric displacement for a unit positive charge at
the origin.

(a) Evaluate $\int_{R_1} D$ where $R_1$ is the disk $z = z_0$, $x^2 + y^2 < a^2$.

(b) Evaluate $\int_{R_2} D$ where $R_2$ is the curved surface of a cylinder:
$-z_0 < z < z_0$, $x^2 + y^2 = a^2$.

(c) Using the results of (a) and (b), check that $\int_{R_3} D = 4\pi$, where $R_3$ is the
cylinder with curved surface $R_2$ plus disks like $R_1$ at its top and
bottom.

16.2. Let $A$ be the one-form $A_x\,dx + A_y\,dy + A_z\,dz$. The quantities $A_x$, $A_y$, and $A_z$
are all functions of $x$, $y$, and $z$. Using the star operator ${\star}$ defined to be linear
and to satisfy

${\star}dx = dy \wedge dz$, ${\star}dy = dz \wedge dx$, ${\star}dz = dx \wedge dy$,

${\star}(dy \wedge dz) = dx$, ${\star}(dz \wedge dx) = dy$, ${\star}(dx \wedge dy) = dz$,

${\star}1 = dx \wedge dy \wedge dz$, ${\star}(dx \wedge dy \wedge dz) = 1$,

compute the following:

(a) ${\star}dA$;

(b) ${\star}d{\star}A$;

(c) ${\star}d{\star}dA$;

(d) ${\star}d{\star}df$ where $f(x, y, z)$ is a scalar field;

(e) ${\star}(A \wedge B)$ where $A$ and $B$ are one-forms;

(f) ${\star}(A \wedge {\star}B)$ where $A$ and $B$ are one-forms.

16.3. Consider spherical coordinates $r, \theta, \phi$ defined by $x = r\sin\theta\cos\phi$,
$y = r\sin\theta\sin\phi$, $z = r\cos\theta$.

(a) Express $dx$, $dy$, $dz$, $dy \wedge dz$, $dz \wedge dx$, $dx \wedge dy$, and $dx \wedge dy \wedge dz$ in
spherical coordinates (i.e., use the pullback equations for $x$, $y$, and $z$ to
pull back these differential forms).

(b) The star operator defined in Exercise 16.2 is a linear transformation
specified by its action with respect to basis vectors $dx, dy, dz, dy \wedge dz, \ldots$.
Express this operator in spherical coordinates; i.e.,
calculate ${\star}dr$, ${\star}d\theta$, and ${\star}d\phi$ in terms of $d\theta \wedge d\phi$, $d\phi \wedge dr$, and $dr \wedge d\theta$,
and calculate ${\star}(d\theta \wedge d\phi)$, ${\star}(d\phi \wedge dr)$, and ${\star}(dr \wedge d\theta)$ in terms of $dr$, $d\theta$,
and $d\phi$.

(c) Express the two-form $\tau_0$ of exercise 16.1 in spherical coordinates, and
show that it equals $-{\star}d(1/r)$.

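16.4. If $u$ represents an electric potential function on $\mathbb{R}^3$, its Laplacian $\Delta u$ may
be defined by

$$\Delta u\,dx \wedge dy \wedge dz = d{\star}du.$$

(a) By regarding $u$ as a function of $x$, $y$, and $z$, confirm that

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}.$$

(b) By regarding $u$ as a function of $r$, $\theta$, and $\phi$, develop a formula for $\Delta u$ in
terms of partial derivatives of $u$ with respect to $r$, $\theta$, and $\phi$.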

16.5. Consider a two-form $D$ expressed in spherical coordinates by
$D = r^3(1 - r)\sin\theta\,d\theta \wedge d\phi$ $(r \le 1)$, $D = 0$ $(r > 1)$.

(a) Verify that if $E = r(1 - r)\,dr$ for $r \le 1$, then $D = {\star}E$. (Note that
$dr \wedge {\star}dr = dx \wedge dy \wedge dz$.)

(b) Find a potential $u$ such that $E = -du$. Also evaluate $\partial D$ (which equals
$-dD$). Now check that $\int_D E = -\int_{\partial D} u$, as required by Stokes’ theorem.
(Recall that $\int_D E = \int E \wedge D$.)

16.6. The differential form that represents $D$ for a unit positive charge may be
expressed in spherical coordinates as $\tau = \sin\theta\,d\theta \wedge d\phi$.

(a) Evaluate $\int_S \tau$ where $S$ is the sphere $x^2 + y^2 + z^2 = 1$.

(b) Show that $d\tau = 0$.

(c) Using Stokes’ theorem, explain why there cannot exist a one-form
$\omega$, defined everywhere except possibly at the origin, for which
$d\omega = \tau$.

(d) Find a one-form $\omega$, defined on the octant of $\mathbb{R}^3$ where $x > 0$, $y > 0$,
$z > 0$, for which $\tau = d\omega$. You will probably want to use spherical
coordinates to invent a suitable $\omega$, but then express it in terms of $x$, $y$,
and $z$.


16.7. Let $u(r, \theta)$ be a smooth function on $\mathbb{R}^2$ with compact support. Evaluate

$$\int_{\mathbb{R}^2} du \wedge \omega \quad\text{where}\quad \omega = d\theta = \frac{x\,dy - y\,dx}{x^2 + y^2}.$$

(Because $\omega$ is undefined at the origin, you should first integrate over the
region outside a disk of radius $\varepsilon$ centered at the origin, then take the limit
as $\varepsilon \to 0$.)

16.8. Let $\phi$ be a rotation in $\mathbb{R}^3$, so that $\phi$ is represented by a matrix $A$ for which
$AA^T = I$ and $\operatorname{Det} A = +1$. Let $\omega$ be a one-form $\omega = F_x\,dx + F_y\,dy + F_z\,dz$.
Prove that in this case

$${\star}(\phi^*\omega) = \phi^*({\star}\omega).$$

16.9. Consider the five functions

$v_1 = r^2$, $v_2 = r^2\cos^2\theta\ (= z^2)$, $v_3 = r^4$, $v_4 = r^4\cos^2\theta\ (= r^2z^2)$,
$v_5 = r^4\cos^4\theta\ (= z^4)$.

These span a five-dimensional vector space which we shall call $C^0$.

(a) Calculate $dv$, ${\star}dv$, and $\Delta v$ for each of these five functions.

(b) Calculate all the scalar products of the form $(dv_i, dv_j)$, letting $U$ be the
interior of the unit sphere, $r \le 1$.

(c) Find a basis for the two-dimensional subspace $H$ of $C^0$, which consists
of functions with $\Delta v = 0$.

(d) Construct the matrix $(1 - \pi)$, which is the orthogonal projection of
$dC^0$ onto $dH$. Thereby you can also write down $\pi$.

(e) Use $1 - \pi$ to find a function $u$ which satisfies $\Delta u = 0$ and which equals
$\cos^4\theta + \cos^2\theta$ on the sphere $r = 1$. (This is a Dirichlet problem.)

(f) Use $\pi$ to find a function $\phi$ which vanishes on the unit sphere, for which
$\Delta\phi = 5r^2 - 4r^2\cos^2\theta + 1$. (Now you are solving the Poisson
equation.)

Chapter 17 continues the study of the exterior differential
calculus. The main topics are vector fields and flows, interior
products and Lie derivatives. These are applied to
magnetostatics.


17.1. Currents 

At first glance, it would seem to be a straightforward matter to give the continuous 
version of resistive networks. All we need to do is make some minor changes in 
the discussion of the preceding chapter. We keep the form $E$ and the equation
$E = -du$. We replace energy by power. We replace the two-form $D$ of electrostatics
by a two-form, $J$, which should represent the smeared-out version of the branch
currents. We have a pairing between the current $J$ and an electric field $E$ given by

$$\int_{\mathbb{R}^3} E \wedge J$$

measured in units of power. Kirchhoff’s current law said that $\partial I = 0$. So the
corresponding smeared-out version says

$$\partial J = 0$$

or, since $\partial = -d$, that

$$dJ = 0.$$

Instead of $D = \varepsilon{\star}E$ for an isotropic electrostatic medium, we will have

$$J = \sigma{\star}E$$

where $\sigma$ is called the conductivity. If we define $r = 1/\sigma$ and call it the resistivity,
then we can write the preceding equation

$$E = r{\star}^{-1}J$$

and this is the smeared-out version of Ohm’s law

$$V = RI.$$

We should now pause for a moment to see what the form $J$ represents. Suppose
we consider some surface $S$. What is the meaning of $\int_S J$? Suppose we had a finite
network and the surface crossed certain branches, say $\alpha$, $\beta$ and $\gamma$, oriented as
indicated in figure 17.1. Then, taking orientation into account, we can think of $S$
as defining the one-cochain which assigns to each current $I$ the number $I_\alpha - I_\beta + I_\gamma$.
More generally, $S$ defines the cochain which assigns to each current $I$ the total
current across $S$, taking orientation into account. Thus, in our smeared-out version,
we should think of $\int_S J$ as the current across $S$.

If we think of current as charge in motion, then we can think of $\int_S J$ as the rate
of charge transport across $S$. If $S$ is a closed surface, $S = \partial U$, then

$$\int_S J = \int_{\partial U} J = \int_U dJ = 0$$

so the total rate of transport across $S$ is zero: no charge can be created in the
interior. This fits with our original discussion of Kirchhoff’s current law, so all
works out fine.

There is a point of geometric interpretation which does require further 
explanation. We know that in reality the electric current consists of electrons in 
motion. Each electron, thought of as a point particle, moves along some curve. 
Thus the form J, integrated over a surface S, is really counting the number of 
electrons crossing S (in either direction) per unit time. We should have some 
geometric object, more directly related to the family of curves of the individual
electrons, which describes their motion, and from which we can reconstruct the
two-form $J$. As we shall see, the correct notion here is that of a vector field.


17.2. Flows and vector fields

In this section, to avoid an accumulation of indices, we shall work for the most
part in $\mathbb{R}^3$. But we will not make any special use of three dimensions, so our
formulations are valid in any dimension.

Let $U$ be an open region in $\mathbb{R}^k$. Let $I$ be some interval in $\mathbb{R}$ containing the
origin. Let $\phi: I \times U \to \mathbb{R}^k$ be a differentiable map. We think of $t \in I$ as representing
an instant of time. We can then regard $\phi$ in two different ways: Suppose we fix
some $p \in U$. Then the map

$$t \mapsto \phi(t, p)$$

is a curve in $\mathbb{R}^k$. Of course, for each different $p$, we get a different curve. We can
consider the tangent vector, $\phi'(t, p) = (d/dt)\phi(t, p)$. On the other hand, for each fixed
$t$ we can consider the map $\phi_t: U \to \mathbb{R}^k$ given by

$$\phi_t(p) = \phi(t, p).$$

Let us assume that

$$\phi_0 = \mathrm{id},$$

i.e., that

$$\phi(0, p) = p$$

for all $p$. Then for each $p$, the tangent $\phi'(0, p)$ to the curve $\phi(\cdot, p)$ is a tangent vector
at $p$. We thus have a rule which assigns to each point $p$ of $U$ a tangent vector at
$p$. Such a rule is called a vector field on $U$. We will denote a vector field by a
symbol such as $\xi$. Thus $\xi(p)$ is a tangent vector at the point $p$.


Figure 17.2


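For any $p$, the assumption $\phi_0 = \mathrm{id}$ implies that $d\phi_0 = \mathrm{id}$ so $d(\phi_t)_p$ is
non-singular for small enough $t$. Thus by the implicit function theorem, $\phi_t$ is one-to-one
with differentiable inverse in some neighborhood of $p$. In order to avoid accumulating
restrictions on domains, let us assume that $\phi_t$ is one-to-one with
differentiable inverse on all of $U$. Fix some time $s \in I$. Then define

$$\phi^s_t = \phi_t \circ \phi_s^{-1}.$$

Then $\phi^s_t$ has all the properties of $\phi_t$, but now

$$\phi^s_s = \mathrm{id}.$$

Thus for each $s \in I$ we get a vector field, call it $\xi_s$: the vector $\xi_s(p)$ is the tangent,
at $t = s$, to the curve $t \mapsto \phi^s_t(p)$. In short, we have shown that,
given $\phi: I \times U \to \mathbb{R}^k$, we get a one-parameter family of vector fields $\xi_s$. For each
$p \in U$ there will be some neighborhood $W$ of $p$ and some neighborhood $K$ of 0
in $\mathbb{R}$ such that $\xi_s$ is defined on $W$ for all $s \in K$.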


Example:

(a) Suppose $w$ is some fixed vector. Let

$$\phi(t, p) = p + tw.$$

Then $\phi'(0, p) = w$ for all $p$. Also

$$\phi_t \circ \phi_s^{-1}(p) = \phi_t(p - sw) = p + (t - s)w = \phi_{t-s}(p)$$

so

$$\xi_s(p) = w \quad\text{for all } p \text{ and } s.$$

(b) Let $A$ be a matrix. Define

$$\phi(t, p) = e^{tA}p.$$

Then

$$\phi'(0, p) = Ap.$$

Also

$$\phi_t \circ \phi_s^{-1}(p) = e^{tA}(e^{-sA}p) = e^{(t-s)A}p = \phi_{t-s}(p)$$

so

$$\xi_s(p) = Ap \quad\text{for all } s.$$

The vector field in this case assigns to each point $p$ the vector $Ap$.
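
Example (b) can be checked numerically. The sketch below is illustrative only and rests on choices that are not in the text: the matrix $A$ is taken to be the generator of rotations of the plane, $e^{tA}$ is computed from a truncated power series, and the vector field $\xi_s$ is approximated by a central difference. The helper names (`mat_exp`, `flow`, `xi`) are hypothetical.

```python
import math


def mat_mul(X, Y):
    """Product of two small matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def mat_exp(A, terms=30):
    """Truncated power series I + A + A^2/2! + ... for the matrix exponential."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def flow(A, t, p):
    """phi(t, p) = e^{tA} p."""
    E = mat_exp([[t * a for a in row] for row in A])
    return [sum(E[i][j] * p[j] for j in range(len(p))) for i in range(len(p))]


def xi(A, s, p, h=1e-5):
    """Central-difference tangent at t = s to the curve t -> phi_t(phi_s^{-1}(p))."""
    q = flow(A, -s, p)  # phi_s^{-1}(p) = e^{-sA} p
    plus, minus = flow(A, s + h, q), flow(A, s - h, q)
    return [(a - b) / (2 * h) for a, b in zip(plus, minus)]


A = [[0.0, -1.0], [1.0, 0.0]]  # assumed example: generator of plane rotations
p = [1.0, 2.0]
s, t = 0.3, 0.5

lhs = flow(A, s + t, p)            # phi_{s+t}(p)
rhs = flow(A, s, flow(A, t, p))    # phi_s(phi_t(p))
Ap = [sum(A[i][j] * p[j] for j in range(2)) for i in range(2)]
rotated = [math.cos(t) * p[0] - math.sin(t) * p[1],
           math.sin(t) * p[0] + math.cos(t) * p[1]]
```

Here `lhs` and `rhs` agree to rounding error, `xi(A, s, p)` agrees with `Ap` for any `s`, and `flow(A, t, p)` is the rotation of `p` through the angle `t`, as the example predicts.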

Notice that, in both examples, the vector field did not depend on $s$. This was
a consequence of the identity

$$\phi_t \circ \phi_s^{-1} = \phi_{t-s}$$

or, replacing $s$ by $-s$,

$$\phi_{s+t} = \phi_s \circ \phi_t.$$

A map $\phi$ satisfying this identity whenever both sides are defined is called a
(stationary) flow. In case $\phi_t(U) = U$ and $I = \mathbb{R}$, we also speak of a one-parameter
group of transformations of $U$.

Thus, for flows, we get a fixed vector field, $\xi = \xi_s$ for all $s$.

Suppose we start with a linear vector field

$$\xi(p) = Ap$$

where $A$ is some given matrix. Then, by the results of Chapter 3, we know that
we can form

$$e^{tA}$$

and then we have $\phi$ defined by

$$\phi(t, p) = e^{tA}p.$$

Thus we have an existence theorem in ordinary differential equations. Starting
with $\xi$ we have found the $\phi$. The corresponding theorem is true in general:

Let $\xi_s$ be a vector field depending smoothly on $s$ and $p$ defined for $s \in I$ and $p \in U$.
Then each $p$ has a neighborhood $W$ in $\mathbb{R}^k$, and there is a neighborhood $K$ of 0 in $\mathbb{R}$
and a smooth map $\phi: K \times W \to \mathbb{R}^k$ with $\phi_0 = \mathrm{id}$, such that the $s$-dependent vector fields
defined by $\phi$ are precisely the $\xi_s$. If the $\xi_s = \xi$ (i.e., do not depend on $s$), then $\phi$ satisfies the identity

$$\phi_{s+t} = \phi_s \circ \phi_t$$

whenever both sides are defined.


The proof of this theorem is not difficult. One proceeds by a method of successive
approximations. We will not present the proof here. It is standard and can be 
found in any text on ordinary differential equations. For example, a proof is given 
in Loomis & Sternberg Chapter 6. 

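Suppose that $\xi$ is a vector field and $f$ is a differentiable function. At each point
$p$, we can construct the directional derivative $D_{\xi(p)}f$. We thus obtain
the function which assigns to each point $p$ the number $D_{\xi(p)}f$; we denote this function by
$D_\xi f$. For example, suppose we are in $\mathbb{R}^3$ and the vector field is given by

$$\xi(p) = \begin{pmatrix} a(p) \\ b(p) \\ c(p) \end{pmatrix}$$

where $a$, $b$ and $c$ are functions. Then

$$D_{\xi(p)}f = a(p)\frac{\partial f}{\partial x}(p) + b(p)\frac{\partial f}{\partial y}(p) + c(p)\frac{\partial f}{\partial z}(p).$$

Thus

$$D_\xi f = a\frac{\partial f}{\partial x} + b\frac{\partial f}{\partial y} + c\frac{\partial f}{\partial z}.$$

For this reason we shall use the following notation: we shall write

$$\xi = a\frac{\partial}{\partial x} + b\frac{\partial}{\partial y} + c\frac{\partial}{\partial z}.$$

In this notation the symbol $\partial/\partial x$, for example, is thought of as the constant vector
field

$$\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.$$

But

$$D_{\partial/\partial x}f = \frac{\partial f}{\partial x}$$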


so this constant vector field acts like the partial derivative. So, similarly, the vector 
field 







can be thought of as the function which assigns to each point the row vector 



We can (at each point) evaluate this row vector on the column vector b to



Let $\phi: I \times U \to \mathbb{R}^k$ be as above and $\xi_s$ the corresponding vector fields. For each $t$ we can
form the pullback $\phi_t^*f = f \circ \phi_t$, and for $t \ne 0$ the expression

$$\frac{1}{t}(\phi_t^*f - f)$$

is again a perfectly good function. Evaluating this function at some point $p$ and
letting $t \to 0$, we see that

$$\lim_{t \to 0}\frac{1}{t}(\phi_t^*f - f)(p) = \lim_{t \to 0}\frac{1}{t}[f(\phi(t, p)) - f(p)] = D_{\xi(p)}f,$$

where $\xi = \xi_0$. So, as functions,

$$\lim_{t \to 0}\frac{1}{t}(\phi_t^*f - f) = D_\xi f.$$

We have now seen that a natural geometric object to attach to a flow is a vector
field. But we have seen in the preceding section that the electric current (which is,
after all, a flow of electrons) is given by a two-form. So we need a mathematical
operation which allows us to pass from the vector field describing the flow of the
electrons to the two-form describing the current. The two-form will depend also
on the density of the electrons. Obviously a higher density of electrons moving
along the same flow lines will produce a larger current. So we really need a
mathematical operation which will pass from vector fields, and three-forms
$\rho\,dx \wedge dy \wedge dz$, to two-forms. We describe this operation in the next section.


17.3. The interior product 


Let $V$ be a vector space and $\omega$ an element of $\Lambda^k(V^*)$. Recall that $\omega$ is a function
(multilinear and antisymmetric) of $k$ vectors in $V$. For every $v_1, \ldots, v_k$, we get a
number

$$\omega(v_1, v_2, \ldots, v_k).$$

Now let $v$ be a vector in $V$. We will define a function of $k - 1$ vectors
$w_1, \ldots, w_{k-1}$ of $V$, given by

$$\omega(v, w_1, \ldots, w_{k-1}).$$

This function (of the $w$’s) is again multilinear and antisymmetric. Hence it is an
element of $\Lambda^{k-1}(V^*)$. We call this function the interior product of $v$ and $\omega$ and denote
it by

$$i(v)\omega.$$

Thus

$$i(v)\omega(w_1, \ldots, w_{k-1}) = \omega(v, w_1, \ldots, w_{k-1}).$$


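Let us see what the interior product looks like in various cases. Suppose we are in
$\mathbb{R}^3$ and take a basis:

$$\frac{\partial}{\partial x} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad\text{and dual basis element } dx = (1, 0, 0),$$

$$\frac{\partial}{\partial y} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \quad\text{and dual basis element } dy = (0, 1, 0),$$

$$\frac{\partial}{\partial z} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \quad\text{and dual basis element } dz = (0, 0, 1).$$

Then if $k = 1$, so $\omega$ is a one-form, $i(v)\omega$ is just $\omega(v)$, a number. Thus

$$i\left(\frac{\partial}{\partial x}\right)dx = 1, \quad i\left(\frac{\partial}{\partial x}\right)dy = 0, \quad i\left(\frac{\partial}{\partial x}\right)dz = 0,$$

etc.

If $k = 2$, then, for example,

$$i\left(\frac{\partial}{\partial x}\right)(dx \wedge dy) = dy, \qquad i\left(\frac{\partial}{\partial x}\right)(dy \wedge dz) = 0,$$

as one checks directly from the definition. If $k = 3$, then
$(dx \wedge dy \wedge dz)(\partial/\partial x, w_1, w_2)$ is a determinant whose first row is $(1, 0, 0)$, and it
therefore equals $(dy \wedge dz)(w_1, w_2)$. So

$$i\left(\frac{\partial}{\partial x}\right)(dx \wedge dy \wedge dz) = dy \wedge dz.$$

Similarly

$$i\left(\frac{\partial}{\partial y}\right)(dx \wedge dy \wedge dz) = -dx \wedge dz$$

and

$$i\left(\frac{\partial}{\partial z}\right)(dx \wedge dy \wedge dz) = dx \wedge dy.$$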


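Now $i(v)\omega$ is clearly bilinear in $v$ and $\omega$. So, if

$$\xi = a\frac{\partial}{\partial x} + b\frac{\partial}{\partial y} + c\frac{\partial}{\partial z}$$

and

$$\omega = \rho\,dx \wedge dy \wedge dz$$

then

$$i(\xi)\omega = \rho a\,dy \wedge dz - \rho b\,dx \wedge dz + \rho c\,dx \wedge dy.$$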
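
The coordinate formula for $i(\xi)\omega$ can be checked directly from the definition. The sketch below is only an illustration under assumptions of our own: forms are represented as Python functions of vectors, wedge products of one-forms are evaluated as determinants, and the numerical values of $\rho$, $a$, $b$, $c$ and of the test vectors are arbitrary. The names `det`, `wedge` and `interior` are hypothetical helpers.

```python
from itertools import permutations


def det(M):
    """Determinant by the permutation (Leibniz) formula; adequate for tiny matrices."""
    n = len(M)
    total = 0.0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1.0
        for i in range(n):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total


def wedge(*covectors):
    """Wedge product of one-forms (coefficient rows), as a function of k vectors."""
    def form(*vectors):
        return det([[sum(c * v for c, v in zip(cov, vec)) for cov in covectors]
                    for vec in vectors])
    return form


def interior(v, form):
    """Interior product: insert v into the first slot of the form."""
    return lambda *ws: form(v, *ws)


dx, dy, dz = (1, 0, 0), (0, 1, 0), (0, 0, 1)
rho, a, b, c = 2.0, 3.0, 5.0, 7.0          # arbitrary sample values
xi = (a, b, c)


def omega(*vs):
    """The three-form rho dx ^ dy ^ dz (with constant rho)."""
    return rho * wedge(dx, dy, dz)(*vs)


J = interior(xi, omega)


def expected(w1, w2):
    """Right-hand side of the coordinate formula for i(xi)omega."""
    return (rho * a * wedge(dy, dz)(w1, w2)
            - rho * b * wedge(dx, dz)(w1, w2)
            + rho * c * wedge(dx, dy)(w1, w2))


w1, w2 = (1.0, 2.0, 0.5), (-1.0, 0.0, 4.0)  # arbitrary test vectors
```

Evaluating `J(w1, w2)` and `expected(w1, w2)` gives the same number, and `J(xi, w1)` vanishes, since inserting the same vector twice into an antisymmetric form gives zero.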

We have defined the interior product $i(\xi)\omega$ for vectors, $\xi$, and for $\omega$ in $\Lambda^k(V^*)$. But,
using this definition at each point, we can equally well define the interior product
$i(\xi)\omega$ where $\xi$ is a vector field, $\omega$ is a differential form of degree $k$. In $\mathbb{R}^3$, all the


In particular, if we start with a vector field, $\xi$, representing the stationary flow
of the electrons, and a three-form $\omega = \rho\,dx \wedge dy \wedge dz$ giving (a smeared-out
approximation to) the electron density, we get a two-form

$$J = i(\xi)\omega$$

giving the current. The explicit formula for $J$ in coordinates was given by the
preceding formula. In order to understand why $J$ represents the current, we can
use the basic definition of interior product. Let us look at a point $p$ and a small
parallelogram spanned by vectors $w_1$ and $w_2$ at $p$. The vector $v = h\xi(p)$ represents
an approximation to the motion of a particle situated near $p$ in a small time
interval, $h$. In other words, the parallelepiped spanned by $v$, $w_1$ and $w_2$ gives
(approximately) the region swept out by the parallelogram in time $h$. The total
charge in this parallelepiped is given by

$$\omega(v, w_1, w_2) = [i(v)\omega](w_1, w_2) = hJ(p)(w_1, w_2).$$

Dividing by $h$, we see that $J(p)(w_1, w_2)$ gives the rate at which charge flows across the
parallelogram spanned by $w_1$ and $w_2$. So $J$, when integrated over a surface, gives
the rate of flow of charge across that surface.

Let us conclude this section by proving the following important formula
concerning the interior product:

$$i(\xi)(\omega_1 \wedge \omega_2) = [i(\xi)\omega_1] \wedge \omega_2 + (-1)^{\deg\omega_1}\omega_1 \wedge i(\xi)\omega_2. \qquad (17.1)$$

(A check shows that this formula is true in all the cases in $\mathbb{R}^3$ that we have
examined.) Here $\xi$ is a vector field and $\omega_1$ and $\omega_2$ are differential forms.
But since (17.1) is purely algebraic, it is enough to prove it at each point, i.e., to
prove it when $\xi = v$ is a vector in $V$ and $\omega_1$ is an element of $\Lambda^k(V^*)$ and $\omega_2$ an
element of $\Lambda^l(V^*)$. Both sides of the expression are linear in $\xi$, $\omega_1$ or $\omega_2$ when
the other two are held fixed. We may write $\xi = a_1v_1 + \cdots + a_nv_n$ where $\{v_1, \ldots, v_n\}$
is a basis. By linearity in $\xi$, it is enough to verify it for $\xi = v_i$, a basis element.
With no loss of generality, we may assume

$$\xi = v_1.$$

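Let us first prove the formula when $\omega_1$ is a one-form. By linearity, we may
assume that

$$\omega_1 = v_i^*, \quad i = 1, \ldots, n,$$

where $v_1^*, \ldots, v_n^*$ is the dual basis, and that $\omega_2 = v_{j_1}^* \wedge \cdots \wedge v_{j_l}^*$ with
$j_1 < \cdots < j_l$. There are four cases to consider. Case (a): $i = 1$, $j_1 > 1$. Then
$(\omega_1 \wedge \omega_2)(v, w_1, \ldots, w_l)$ is a determinant whose first row consists of the values of
$v_1^*, v_{j_1}^*, \ldots, v_{j_l}^*$ on $v = v_1$, and so is $(1, 0, \ldots, 0)$. For example,

$$(v_1^* \wedge v_2^* \wedge v_3^* \wedge v_4^*)(v_1, w_1, w_2, w_3) = \operatorname{Det}\begin{pmatrix} 1 & 0 & 0 & 0 \\ v_1^*(w_1) & v_2^*(w_1) & v_3^*(w_1) & v_4^*(w_1) \\ v_1^*(w_2) & v_2^*(w_2) & v_3^*(w_2) & v_4^*(w_2) \\ v_1^*(w_3) & v_2^*(w_3) & v_3^*(w_3) & v_4^*(w_3) \end{pmatrix}$$

$$= \operatorname{Det}\begin{pmatrix} v_2^*(w_1) & v_3^*(w_1) & v_4^*(w_1) \\ v_2^*(w_2) & v_3^*(w_2) & v_4^*(w_2) \\ v_2^*(w_3) & v_3^*(w_3) & v_4^*(w_3) \end{pmatrix} = (v_2^* \wedge v_3^* \wedge v_4^*)(w_1, w_2, w_3).$$

Thus the determinant expression for $(\omega_1 \wedge \omega_2)(v, w_1, \ldots, w_l)$ is the same as the
expression for $\omega_2(w_1, \ldots, w_l)$. Thus

$$i(v)(\omega_1 \wedge \omega_2) = \omega_2 = [i(v)\omega_1]\omega_2 - \omega_1 \wedge i(v)\omega_2$$

since $i(v)\omega_1 = 1$ and $i(v)\omega_2 = 0$ in this case.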

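We now turn to case (b): $i = 1$, $j_1 = 1$. Then $\omega_1 \wedge \omega_2 = 0$. Now $i(v_1)\omega_1 = 1$ so

$$(i(v_1)\omega_1) \wedge \omega_2 = \omega_2,$$

while

$$i(v_1)\omega_2 = v_{j_2}^* \wedge \cdots \wedge v_{j_l}^*$$

so

$$\omega_1 \wedge i(v_1)\omega_2 = v_1^* \wedge v_{j_2}^* \wedge \cdots \wedge v_{j_l}^* = \omega_2.$$

Thus

$$i(v_1)\omega_1 \wedge \omega_2 - \omega_1 \wedge i(v_1)\omega_2 = \omega_2 - \omega_2 = 0.$$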


Now consider case (c) where $i > 1$ and $j_1 = 1$. Then we can write $\omega_2 = v_1^* \wedge \omega_3$ and
the left hand side of (17.1) can be written as

$$i(v)[\omega_1 \wedge v_1^* \wedge \omega_3] = -i(v)[v_1^* \wedge (\omega_1 \wedge \omega_3)] \quad\text{(by interchanging } v_1^* \text{ with } \omega_1\text{)}$$

$$= -\omega_1 \wedge \omega_3$$

by case (a). The first term on the right hand side of (17.1) vanishes and the second
term equals $-\omega_1 \wedge \omega_3$ by another application of case (a).

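Finally, if (d) $i > 1$ and $j_1 > 1$, then

$$i(v_1)(\omega_1 \wedge \omega_2) = 0,$$

$$i(v_1)\omega_1 = 0$$

and

$$i(v_1)\omega_2 = 0$$

so both sides of our equation vanish. We have now proved formula (17.1)
for the case that $\deg\omega_1 = 1$. But then the associative law for exterior multiplication
will allow us to prove it in general. For example, suppose that $\omega_1 = \sigma_1 \wedge \sigma_2$ where
$\deg\sigma_1 = \deg\sigma_2 = 1$. Then

$$i(v)(\sigma_1 \wedge \sigma_2 \wedge \omega_2) = i(v)\sigma_1 \wedge (\sigma_2 \wedge \omega_2) - \sigma_1 \wedge i(v)(\sigma_2 \wedge \omega_2)$$

$$= i(v)\sigma_1 \wedge (\sigma_2 \wedge \omega_2) - \sigma_1 \wedge i(v)\sigma_2 \wedge \omega_2 + \sigma_1 \wedge \sigma_2 \wedge i(v)\omega_2$$

$$= [i(v)\sigma_1 \wedge \sigma_2 - \sigma_1 \wedge i(v)\sigma_2] \wedge \omega_2 + \sigma_1 \wedge \sigma_2 \wedge i(v)\omega_2$$

$$= i(v)(\sigma_1 \wedge \sigma_2) \wedge \omega_2 + (\sigma_1 \wedge \sigma_2) \wedge i(v)\omega_2.$$

This proves the formula when $\omega_1 = \sigma_1 \wedge \sigma_2$ and hence, by linearity, for all cases
with $\deg\omega_1 = 2$. We can continue to prove the general formula by induction. For
example, if $\deg\sigma_1 = 1$ and $\deg\sigma_2 = 2$,

$$i(v)(\sigma_1 \wedge \sigma_2 \wedge \omega_2) = i(v)\sigma_1 \wedge (\sigma_2 \wedge \omega_2) - \sigma_1 \wedge i(v)(\sigma_2 \wedge \omega_2)$$

$$= i(v)\sigma_1 \wedge \sigma_2 \wedge \omega_2 - \sigma_1 \wedge i(v)\sigma_2 \wedge \omega_2 - \sigma_1 \wedge \sigma_2 \wedge i(v)\omega_2$$

$$= i(v)(\sigma_1 \wedge \sigma_2) \wedge \omega_2 - \sigma_1 \wedge \sigma_2 \wedge i(v)\omega_2,$$

so that

$$i(v)(\omega_1 \wedge \omega_2) = i(v)\omega_1 \wedge \omega_2 - \omega_1 \wedge i(v)\omega_2$$

if $\deg\omega_1 = 3$, and so on.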


17.4. Lie derivatives 


Let $\phi: I \times U \to \mathbb{R}^k$ be as in section 17.2. Thus we have, for each $t \in I$, the map

$$\phi_t: U \to \mathbb{R}^k, \quad \phi_t(p) = \phi(t, p)$$

and we assume that

$$\phi_0 = \mathrm{id}.$$

Let $\xi = \xi_0$ be the vector field associated to $\phi_t$ at $t = 0$.

Let $\omega$ be a differential form of degree $k$. We can consider the form $\phi_t^*\omega$ and hence
the form

$$\frac{1}{t}(\phi_t^*\omega - \omega)$$

for $t \ne 0$. We claim that the limit of this expression exists as $t \to 0$. Indeed, if $\omega = f$
is a function, this limit exists at $t = 0$ and is just the expression

$$D_\xi f = \lim_{t \to 0}\frac{1}{t}(\phi_t^*f - f)$$

as we have seen. If $\omega = df$,

$$\phi_t^*df = d\phi_t^*f$$

so

$$\frac{1}{t}(\phi_t^*df - df) = d\left[\frac{1}{t}(\phi_t^*f - f)\right].$$

Interchanging the limits involved in the partial derivatives expressing $d$ and $D_\xi$ is
legitimate and so we see that

$$\lim_{t \to 0}\frac{1}{t}(\phi_t^*df - df) = dD_\xi f.$$

Now the most general linear differential form is a sum of expressions of the form $f\,dg$
and

$$\phi_t^*(f\,dg) - f\,dg = (\phi_t^*f)\phi_t^*dg - f\,dg$$

$$= (\phi_t^*f)\phi_t^*dg - f\phi_t^*dg + f(\phi_t^*dg) - f\,dg$$

so

$$\lim_{t \to 0}\frac{1}{t}(\phi_t^*(f\,dg) - f\,dg) = (D_\xi f)\,dg + f\,d(D_\xi g).$$

Thus $\lim_{t \to 0}(1/t)(\phi_t^*\omega - \omega)$ exists for any one-form. Furthermore, this limit can
be expressed in terms of $\xi$ and $\omega$. We shall denote this limit by $D_\xi\omega$. Now if

$$\omega = \omega_1 \wedge \omega_2$$

where $\omega_1$ and $\omega_2$ are one-forms we have

$$\phi_t^*(\omega_1 \wedge \omega_2) - \omega_1 \wedge \omega_2 = \phi_t^*\omega_1 \wedge \phi_t^*\omega_2 - \omega_1 \wedge \omega_2$$

$$= (\phi_t^*\omega_1 - \omega_1) \wedge \phi_t^*\omega_2 + \omega_1 \wedge (\phi_t^*\omega_2 - \omega_2).$$

Dividing by $t$ and passing to the limit, we see that

$$\lim_{t \to 0}\frac{1}{t}(\phi_t^*(\omega_1 \wedge \omega_2) - \omega_1 \wedge \omega_2)$$

exists and equals

$$D_\xi\omega_1 \wedge \omega_2 + \omega_1 \wedge D_\xi\omega_2.$$

As every two-form can be written as a sum of terms like $\omega_1 \wedge \omega_2$, we see that

$$\lim_{t \to 0}\frac{1}{t}(\phi_t^*\omega - \omega)$$

exists for any two-form $\omega$. We may call this limit $D_\xi\omega$, as the limit depends only on $\xi$
and $\omega$. Proceeding in this way, we see that this limit exists for all $\omega$ and depends only
on $\xi$ and $\omega$. We denote it by $D_\xi\omega$. It is called the Lie derivative of $\omega$ with respect to $\xi$.
The proof shows that, for any forms $\omega_1$ and $\omega_2$, we have

$$D_\xi(\omega_1 \wedge \omega_2) = D_\xi\omega_1 \wedge \omega_2 + \omega_1 \wedge D_\xi\omega_2.$$

The definition, and the fact that $d\phi_t^* = \phi_t^*d$, shows that

$$dD_\xi\omega = D_\xi d\omega.$$

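There is a formula which relates the interior product, $i(\xi)$, the exterior derivative,
$d$, and the Lie derivative, $D_\xi$. Before stating the formula, we make an observation.
Recall that $d$ maps $l$-forms into $(l + 1)$-forms and satisfies

$$d(\omega_1 \wedge \omega_2) = d\omega_1 \wedge \omega_2 + (-1)^k\omega_1 \wedge d\omega_2,$$

if degree $\omega_1 = k$. Also $i(\xi)$ maps $l$-forms into $(l - 1)$-forms and satisfies

$$i(\xi)(\omega_1 \wedge \omega_2) = i(\xi)\omega_1 \wedge \omega_2 + (-1)^k\omega_1 \wedge i(\xi)\omega_2.$$

Therefore, $d \circ i(\xi)$ maps $l$-forms into $l$-forms and satisfies

$$[d \circ i(\xi)](\omega_1 \wedge \omega_2) = d\left(i(\xi)\omega_1 \wedge \omega_2 + (-1)^k\omega_1 \wedge i(\xi)\omega_2\right)$$

$$= d\,i(\xi)\omega_1 \wedge \omega_2 + (-1)^{k-1}i(\xi)\omega_1 \wedge d\omega_2 + (-1)^k d\omega_1 \wedge i(\xi)\omega_2 + (-1)^{2k}\omega_1 \wedge d\,i(\xi)\omega_2.$$

Now $(-1)^{2k} = 1$. Also we have the corresponding formula

$$[i(\xi) \circ d](\omega_1 \wedge \omega_2) = i(\xi)d\omega_1 \wedge \omega_2 + (-1)^{k+1}d\omega_1 \wedge i(\xi)\omega_2 + (-1)^k i(\xi)\omega_1 \wedge d\omega_2 + \omega_1 \wedge i(\xi)d\omega_2.$$

If we add these two expressions, the middle terms cancel, and we have the derivation
property

$$[i(\xi) \circ d + d \circ i(\xi)](\omega_1 \wedge \omega_2) = [i(\xi)d + d\,i(\xi)]\omega_1 \wedge \omega_2 + \omega_1 \wedge [i(\xi)d + d\,i(\xi)]\omega_2.$$

We can now state the formula:

$$D_\xi = i(\xi)d + d\,i(\xi). \qquad (17.2)$$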






Proof. Both sides of this equation are operators which satisfy the Leibnitz identity
when acting on products. Every form is a sum of products of functions $f$ and
differentials $dg$. So we need only to verify the identity

$$D_\xi\omega = i(\xi)d\omega + d\,i(\xi)\omega$$

for $\omega = f$ and for $\omega = dg$. For the case $\omega = f$, $i(\xi)f = 0$ by convention. (There are no
$(-1)$-forms.) The formula reduces to

$$D_\xi f = i(\xi)df = df(\xi),$$

which we know from section 17.2. For the case $\omega = dg$, we have

$$D_\xi\,dg = d\,D_\xi g = d(i(\xi)dg) = (i(\xi)d + d\,i(\xi))\,dg$$

since $d^2 = 0$. So the formula is true in general. We will give an alternative proof
of a generalization of this formula in the appendix to this chapter.

For example, suppose

$$\omega = \rho\,dx \wedge dy \wedge dz$$

in $\mathbb{R}^3$. Then $d\omega = 0$, since there are no non-zero four-forms in three dimensions.
Thus

$$D_\xi\omega = d\,i(\xi)\omega.$$

If we set

$$J = i(\xi)\omega,$$

then the condition

$$dJ = 0$$

is equivalent to

$$D_\xi\omega = 0.$$

This is the infinitesimal way of asserting that $\phi_t^*\omega = \omega$ for all $t$. Thus Kirchhoff’s
current law can be formulated as saying that the flow $\phi_t$ preserves the charge density
$\omega$ in the sense that $D_\xi\omega = 0$.
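
In coordinates, with $\xi = a\,\partial/\partial x + b\,\partial/\partial y + c\,\partial/\partial z$, the formula for $i(\xi)\omega$ gives $dJ = \left(\partial(\rho a)/\partial x + \partial(\rho b)/\partial y + \partial(\rho c)/\partial z\right) dx \wedge dy \wedge dz$, so the condition $D_\xi\omega = 0$ says that the divergence of $\rho\xi$ vanishes. The following sketch checks this numerically. It is an illustration under assumptions of our own: the flow is taken to be rotation about the $z$-axis, the two densities are sample choices, and the divergence is approximated by central differences.

```python
import math


def divergence(F, p, h=1e-5):
    """Central-difference divergence of a vector field F on R^3 at the point p."""
    total = 0.0
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        total += (F(plus)[i] - F(minus)[i]) / (2 * h)
    return total


def xi(p):
    """Assumed example: the vector field -y d/dx + x d/dy of rotation about the z-axis."""
    x, y, z = p
    return (-y, x, 0.0)


def current(rho):
    """Coefficients (rho*a, rho*b, rho*c) of J = i(xi)(rho dx ^ dy ^ dz)."""
    return lambda p: tuple(rho(p) * comp for comp in xi(p))


def rho_symmetric(p):
    # A density that is invariant under rotation about the z-axis.
    return math.exp(-(p[0] ** 2 + p[1] ** 2))


def rho_lopsided(p):
    # A density that is not invariant under the rotation.
    return p[0]


p0 = (0.7, -0.4, 1.2)
div_symmetric = divergence(current(rho_symmetric), p0)
div_lopsided = divergence(current(rho_lopsided), p0)
```

For the rotation-invariant density the divergence vanishes up to discretization error, so $dJ = 0$ there. For the density $\rho = x$ one finds $-y$ instead, which is the directional derivative of $\rho$ along $\xi$, and the flow does not preserve $\omega$.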


17.5. Magnetism 

In a very primitive form, certain manifestations of the phenomena of electricity
and magnetism were known to the ancients. It is one of the wonders of history
that the investigation of these two obscure effects, by scientists of the seventeenth
and eighteenth centuries, led in the nineteenth century to the understanding of
the fundamental role that electromagnetic forces have in nature and to the
revolutionary change in society brought about by electrical technology.

It was known of amber
that, when rubbed by cloth, it could make small bits of light material jump up
and stick to it. The Greek word for amber is ‘elektron’. Gilbert, in his book On
the Magnet, published in London in 1600, introduced the term ‘electricity’ to mean
the property of attracting like amber. He devoted a whole long chapter to amber
‘to show the nature of
difference between this and the magnetic actions’ - to distinguish between the ‘pure
as similar phenomena giving a similar explanation for both. Cardano (1550) had




Gilbert discovered that the electrostatic attractive property of amber could be
reproduced in a number of other hard substances. Of course, the electrostatic
attraction of a charged body for a light neutral body - involving induced
polarization and stronger attraction for the nearer opposite charge - is a difficult
phenomenon both for precise theoretical computation and for experiment. The
light bits of paper fly to the amber in a jerky irregular manner, stick to it, then
fall off after a while, certainly ‘promiscuous’ behavior when compared to the
steady, regular attraction between magnets and iron or magnets for one another.
In fact, it took nearly two more centuries to realize that the fundamental attraction


and to discover the correct law of force. This was done in careful experiments with
the torsion balance by Coulomb in the 1780s.

Ancient Greek writings referred to certain stones that had the property of 
attracting iron. A large such stone near the city of Magnesia in Asia Minor was 
reported to pull at the iron tips of shepherd’s staffs and the nails in their shoes. 
From this city’s name, we get the term magnet. It was discovered that this stone 


acquire a directional quality or ‘load’ from which the term lodestone was derived.






law.) However, our true understanding of the nature of magnetic forces had to
await the work of Ampere. It turned out that the correct object of study was not
the attraction of magnets for iron or of magnets for one another. Rather it was
the force that a magnet exerts on a moving charge (or, better yet, the force that
one moving charge exerts on another).

In the spring of 1820 while delivering a lecture on electric currents, Oersted
observed that a compass needle was deflected by a nearby electric
current. The announcement of this result - that an electric current has magnetic
effects - astounded the scientific world. Electricity and magnetism, separated for

Thus $i(v)B$ is a linear function on tangent vectors at $p$ - and that is what we have

subject to the force

It follows immediately from the definition of the interior product that

$$i(v)i(w)\omega = -i(w)i(v)\omega$$

for any $v$, $w$ and $\omega$. In particular,

$$i(v)i(v)\omega = 0,$$

or, if $\omega \in \Lambda^2(V^*)$, then

$$[i(v)\omega](v) = 0.$$



the wire. (This was verified by Ampere in a clever experiment involving a movable

At each point, the form $B$, if $\ne 0$, will determine a direction in space - the line
given by the equation

$$i(w)B = 0.$$

(You should convince yourself that this really is a pair of linearly independent
equations for $w$.) Iron filings placed in the magnetic field (free to rotate but not to
move) will align themselves in these directions, producing the magnetic lines of
force. (These are precisely the directions in which a current will feel no force.)
Ampere not only discovered the force that a magnetic field produces on a current

$H$ which, following Sommerfeld, we will call the magnetic excitation. The second

In other words, the integral of the magnetic excitation around any closed curve

By Stokes’ theorem, this is the same as

$$dH = 4\pi J.$$

In contrast to electrostatics, we can formulate the next law of magnetostatics as

$$dB = 0.$$

(‘There are no magnetic poles’ to quote Hertz.) Finally, we need a law relating $B$
and $H$.

(i) the force on a current element $I = ev$ at a point $p$ is given by $i(I)B$;

(ii) $\int_S B = 0$ for a closed surface $S$; so $dB = 0$;

(iii) $\int_{\partial S} H = 4\pi\int_S J$ or $dH = 4\pi J$;

and

(iv) $B = \mu{\star}H$.


Suppose that we are in a region U where H 2 (U) = 0. Then the condition dB = 0 


B = dA 


lor some one-form. 


course 


A — A x dx 4- A dy + A z dz 




the same equation. Let us choose ^ so that 


d 2 ip _ d 2 ip _ d 2 \j/ 

dx 2 dy 2 dz 2 


dx dy dz 


* The following computations take on a cleaner form once we introduce the general form of the ★
operator. See the next chapter.



B = μ★H,
dH = 4πJ

reduce to solving three Poisson equations

AA^ = — An J Y 


A/L = — 471J.. 


LH.fi Tini] l» 


se we wis. 


thin, straight wire carrying a steady current I. We may assume that the 













radius a and is along the z-axis. So J_x = J_y = 0 and

J_z = 0 outside the wire,
J_z = I/(πa²) inside the wire.

Then A_x = A_y = 0 and we solve to get

A_z = −2μI log √(x² + y²)


for points outside the wire. Thus

B = dA = (∂A_z/∂x) dx ∧ dz + (∂A_z/∂y) dy ∧ dz

= −2μI [(x/r²) dx ∧ dz + (y/r²) dy ∧ dz]

= −(2μI/r) dr ∧ dz

in terms of cylindrical coordinates

x = r cos θ, r = √(x² + y²),
y = r sin θ,
z = z.

The lines of force are circles centered at the wire, and H = μ⁻¹★B is proportional
to dθ. If we knew this fact in advance, i.e., that




then we could conclude from symmetry that f = f(r) and then from

∫_γ H = 4πI

we could conclude that

f(r) = 2I/r.

Since ★(r dθ) = −dr ∧ dz we would then know that

B = μ★H = −(2μI/r) dr ∧ dz.

More generally, the solution of the Poisson equation shows that A is given by the
volume integral


= 



j x (w)d 3 w 


with similar formulas for A_y and A_z.


formula 


of differential calculus 


We give an alternative proof of the basic formula (17.2). In fact, we will prove it
in somewhat greater generality. Let W ⊂ ℝᵐ and Z ⊂ ℝⁿ be open sets. Suppose
that we are given a differentiable map φ: W × I → Z, where I is some interval in
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/ 

denote the tangent vector to this curve at φ(w, t). It is a tangent vector in the image
space, ℝⁿ, but depending on t and w ∈ W. We also let φ_t: W → Z be the map given
by

φ_t(w) = φ(w, t).

We think of φ_t as a one-parameter family of maps of W into Z.


Let σ be a differential (k + 1)-form on Z. For each t let us define the k-form
φ_t* i(ξ_t)σ on W by the formula

φ_t* i(ξ_t)σ(η₁, …, η_k) = σ(ξ_t(w), dφ_t η₁, …, dφ_t η_k),

where η₁, …, η_k


are tangent vectors at w. All k + 1 vectors occurring i 
makes sense. We take it as the definition of the left-hand side. In generalization
of (17.2) we wish to prove the following: 

Let σ_t be a smooth one-parameter family of forms on Z. Then φ_t*σ_t is a
smooth family of forms on W and the basic formula of the differential
calculus of forms asserts that

(d/dt) φ_t*σ_t = φ_t*(dσ_t/dt) + φ_t* i(ξ_t) dσ_t + d φ_t* i(ξ_t) σ_t.    (17.3)


We first prove the formula in the special case where W = Z = M × I where M
is an open subset of ℝⁿ and ψ_t is the map ψ_t: M × I → M × I given by

ψ_t(x, s) = (x, s + t).

The most general differential form on M × I can be written as

ds ∧ a + b,

where a and b are forms on M that may depend on t and s. (In terms of local
coordinates, s, x¹, …, xⁿ, these forms are sums of terms that look like

c dx^{i₁} ∧ ⋯ ∧ dx^{i_k},

where c is a function of t, s, and x.) To show the dependence on x and s we shall
write

σ_t = ds ∧ a(x, s, t) dx + b(x, s, t) dx.


With this notation it is clear that

ψ_t*σ_t = ds ∧ a(x, s + t, t) dx + b(x, s + t, t) dx

and therefore

(d/dt) ψ_t*σ_t = ds ∧ (∂a/∂s)(x, s + t, t) dx + (∂b/∂s)(x, s + t, t) dx

+ ds ∧ (∂a/∂t)(x, s + t, t) dx + (∂b/∂t)(x, s + t, t) dx.





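Writing d_x for the exterior derivative in the x variables alone, we have

dσ_t = −ds ∧ d_x a dx + (∂b/∂s) ds ∧ dx + d_x b dx,

i(∂/∂s) dσ_t = −d_x a dx + (∂b/∂s) dx,

d i(∂/∂s) σ_t = (∂a/∂s) ds ∧ dx + d_x a dx.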


Adding equations (17.4)-(17.6) proves (17.3) for ψ_t.

Now let φ: W × I → Z be given by φ(w, s) = φ_s(w). Then the image under φ of
the lines parallel to I through w in W × I are just the curves φ_s(w) in Z. In other
words

dφ(∂/∂s) = ξ_t(w) at the point (w, t).

If we let ι: W → W × I be given by ι(w) = (w, 0), then we can write the map φ_t as
φ ∘ ψ_t ∘ ι. Thus φ_t*σ_t = ι*ψ_t*φ*σ_t and, since ι and φ do not vary with t,

(d/dt) φ_t*σ_t = ι* (d/dt) ψ_t*(φ*σ_t).

At the point (w, t) of W × I, we have

i(∂/∂s) φ*σ_t = φ*(i(ξ_t)σ_t)

and thus

ι*ψ_t* i(∂/∂s) φ*σ_t = ι*ψ_t*φ*(i(ξ_t)σ_t) = φ_t*(i(ξ_t)σ_t).





Summary 

A Differential forms, vector fields and magnetism

Given a vector field ξ and a differential form ω you should know how to define and
compute the Lie derivative D_ξ ω.

You should be able to define and compute the interior product of a vector and 
a differential form and use it to formulate the magnetic force law. 

You should know the formulation of the laws of magnetostatics, and the 
technique for calculating B for a given current J, in terms of differential forms. 




17.1. Let {e₁, e₂, e₃} be a basis for ℝ³ so that

dx[e₁] = dy[e₂] = dz[e₃] = 1.

Let ξ = x e₂ − y e₁. This vector field ξ describes the velocity associated
with uniform rotation about the z-axis. We can also denote it by ξ =
x(∂/∂y) − y(∂/∂x).

(a) For uniform charge density, ω = ρ₀ dx ∧ dy ∧ dz. Using the relation
J = i(ξ)ω, determine the current associated with a rotating uniform
distribution of charge.

(b) Let ω = z dx ∧ dy. Confirm explicitly the identities D_ξ ω = i(ξ)dω +
d i(ξ)ω and dD_ξ ω = D_ξ dω, where ξ = .


17.2. Suppose current I per unit length flows axially outward from the z-axis in
such a way that the current passing through the wall of a cylinder of radius
r centered on the z-axis is independent of r. In this case the velocity of
charge carriers is described by



Calculate the two-form J using J = i(ξ)dx ∧ dy ∧ dz and confirm that
dJ = 0.

17.3. Suppose that B = B₀ dz, where B₀ is a constant. Find A satisfying the
conditions

B = dA and ∂A_x/∂x + ∂A_y/∂y + ∂A_z/∂z = 0.

17.4. Suppose that within a wire of radius a there is an axially symmetric current
J whose magnitude is proportional to r². Explicitly, j_x = j_y = 0 and

j_z = 0 outside the wire,
j_z = 2I(x² + y²)/(πa⁴) inside the wire.

(a) Find a function A_z, depending only on r = √(x² + y²), such that

ΔA_z = −4πj_z.

(b) Calculate B and H, in both cylindrical and Cartesian coordinates.
Confirm that dH = 4πJ.

17.5. A very long cylinder of radius a, centered on the z-axis, has a uniform
charge density ρ₀. Outside the cylinder there is no charge. The cylinder is
set into rotation about its axis with angular velocity ω, so that the
magnitude of J is ωrρ₀ for r < a.

(a) Express J in cylindrical coordinates and in Cartesian coordinates. (See
Exercise 17.1.)

(b) By symmetry, B is of the form B = f(r) dx ∧ dy, and H = (1/μ) f(r) dz.
Determine H so that dH = 4πJ.

(c) Determine A both inside and outside the rotating cylinder. In
cylindrical coordinates, A will be of the form g(r) dθ.
In this chapter, we discuss the star operator in general. This
will allow us to formulate Maxwell’s equations in the next
chapter and to understand the relation between Maxwell’s 
equations and the geometry of spacetime. It will also explain 
the various vector calculus operators and identities. 


18.1. Scalar products and exterior algebra 

Let V be a finite-dimensional vector space equipped with a non-degenerate scalar
product. We do not assume that this scalar product, ( , ), is positive-definite.
However, it might be good to keep the positive-definite case in mind to have an
intuitive idea of what is going on. What we wish to explain in this section is that ( , )
induces a scalar product on each of the spaces Λ^k(V*). The idea is that the notion of
length of a line segment implies an area for parallelograms, a volume for
parallelepipeds, etc.

We begin by pointing out that a scalar product on V induces a scalar product on
V*. Indeed, any scalar product on V induces a map L: V → V* given by

L(v)(w) = (v, w), v, w ∈ V.

In this equation L(v) is an element of V*, i.e., a function on V, and so may be
evaluated on any w ∈ V. The map L is defined so that this evaluation yields (v, w). Now
if ( , ) is non-degenerate, then (v, w) = 0 for all w implies v = 0. Hence L(v)(w) = 0 for
all w implies v = 0, so L(v) = 0 implies v = 0. Since dim V* = dim V, we conclude that
( , ) is nonsingular if and only if L is an isomorphism. Since we are assuming that
( , ) is non-degenerate, we conclude that L is an isomorphism. We can now define
a scalar product on V* by

(α, β)_{V*} = (L⁻¹α, L⁻¹β)_V, α, β ∈ V*.






For example, suppose that V = ℝⁿ consists of column vectors, so V* = ℝⁿ* consists
of row vectors. The general scalar product on ℝⁿ is given by a symmetric matrix, S.
Suppose S is diagonal, so

(v, w) = s₁v₁w₁ + ⋯ + s_n v_n w_n

if v and w have components v_i and w_i. Then L(v) is the row vector (s₁v₁, …, s_n v_n),
so

(α, β)_{V*} = Σ (1/s_i) α_i β_i.

Notice that the s_i now occur in the denominator. We must assume that no s_i = 0.
This is the non-degeneracy assumption on ( , ).
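
The diagonal case above can be checked by hand or by machine. The following is a minimal sketch in Python, assuming a diagonal scalar product with entries s_i; the function names and the sample numbers are hypothetical and serve only as an illustration. It computes (α, β)_{V*} with the s_i in the denominator and confirms that L carries the scalar product on V to the one on V*.

```python
from fractions import Fraction

def scalar_product(s, v, w):
    """(v, w) = sum of s_i v_i w_i for the diagonal scalar product diag(s) on V."""
    return sum(si * a * b for si, a, b in zip(s, v, w))

def lower(s, v):
    """The map L: V -> V*; for diagonal S the row vector L(v) has entries s_i v_i."""
    return [si * a for si, a in zip(s, v)]

def dual_scalar_product(s, alpha, beta):
    """Induced scalar product on V*: the s_i now occur in the denominator."""
    if any(si == 0 for si in s):
        raise ValueError("degenerate scalar product: some s_i is zero")
    return sum(Fraction(a * b) / si for si, a, b in zip(s, alpha, beta))

# Hypothetical sample data: an indefinite diagonal scalar product on R^2.
s = [4, -1]
alpha, beta = [2, 3], [1, 5]
value = dual_scalar_product(s, alpha, beta)
```

Using exact fractions avoids rounding, so the comparison with the scalar product on V is an exact equality.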

If ( , ) is positive-definite and {e₁, …, e_n} is an orthonormal basis of V, let
{e₁*, …, e_n*} be the dual basis. An examination of the definition of L will show that

L(e_i) = e_i*,

so the dual basis is also orthonormal. In particular,
in the case of Euclidean space ℝⁿ, the basis one-forms dx¹, …, dxⁿ are orthonormal
with

(dx^i, dx^i) = +1.






We claim that there is a unique scalar product defined on Λ^k(V*) such that

(α¹ ∧ ⋯ ∧ α^k, γ¹ ∧ ⋯ ∧ γ^k) = Det((α^i, γ^j)).

One then extends ( , ) by bilinearity to all αs and γs which are
sums of decomposable elements, i.e., expressions like α¹ ∧ ⋯ ∧ α^k and γ¹ ∧ ⋯ ∧ γ^k




and/or γ are written as sums of decomposable elements in two different ways, we
shall get the same value for (α, γ). This proof becomes quite straightforward once we
use a slightly more abstract definition of the space Λ^k(V*) than we have been using so
far. We give this definition and proof in the appendix to this chapter, so as not to
interrupt the flow of discussion here.

We now get a (function-valued) scalar product on the space of k-forms. For
example, for two-forms,

(ω¹ ∧ ω², τ¹ ∧ τ²) = (ω¹, τ¹)(ω², τ²) − (ω¹, τ²)(ω², τ¹).

Thus, for example, in ℝ³ with its Euclidean scalar product,

(dx ∧ dy, dx ∧ dy) = +1.
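
The determinant rule for decomposable k-forms is easy to experiment with. Here is a small Python sketch, under the assumption that one-forms are given by their components in an orthogonal basis with (dx^i, dx^i) = ±1; the helper names are hypothetical. It evaluates Det((α^i, γ^j)) directly and reproduces the two-form formula above.

```python
from itertools import permutations

def det(m):
    """Determinant by the Leibniz expansion (fine for the small matrices used here)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1
        for i in range(n):
            prod *= m[i][perm[i]]
        total += sign * prod
    return total

def one_form_product(a, b, signs):
    """(a, b) for one-forms with components a, b; signs[i] = (dx^i, dx^i) = +1 or -1."""
    return sum(s * x * y for s, x, y in zip(signs, a, b))

def k_form_product(alphas, gammas, signs):
    """(alpha^1 ^ ... ^ alpha^k, gamma^1 ^ ... ^ gamma^k) = Det((alpha^i, gamma^j))."""
    return det([[one_form_product(a, g, signs) for g in gammas] for a in alphas])

euclid3 = (1, 1, 1)
dx, dy = (1, 0, 0), (0, 1, 0)
value = k_form_product([dx, dy], [dx, dy], euclid3)
```

With the signs (1, −1) of two-dimensional spacetime the same function gives (dt ∧ dx, dt ∧ dx) = −1.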



r = √(x² + y² + z²), θ = arctan(√(x² + y²)/z), φ = arctan(y/x).

If we calculate the differentials of these functions and express the results in terms of
the basis {dx, dy, dz} with coefficients expressed for convenience in terms of r, θ, and
φ, we find

dr = sin θ cos φ dx + sin θ sin φ dy + cos θ dz,

dθ = (1/r)(cos θ cos φ dx + cos θ sin φ dy − sin θ dz),

dφ = (1/(r sin θ))(−sin φ dx + cos φ dy).

Direct calculation, using the orthonormality of dx, dy, and dz, shows that dr, dθ,
and dφ are orthogonal elements of V* at each point (r, θ, φ), and that

(dr, dr) = 1, (dθ, dθ) = 1/r², (dφ, dφ) = 1/(r² sin² θ).

Finally, we can calculate the scalar product of the basis element dr ∧ dθ ∧ dφ with
itself:

(dr ∧ dθ ∧ dφ, dr ∧ dθ ∧ dφ) = 1/(r⁴ sin² θ).

The best way to summarize all of these results is to notice that

dr, r dθ, and r sin θ dφ

form an orthonormal basis for Λ¹(ℝ³*) at each point so that we can calculate with
these three differentials just as we do with dx, dy, and dz.
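
As a sanity check on the spherical-coordinate scalar products, the components of dr, dθ and dφ in the basis dx, dy, dz can be evaluated numerically at a sample point. The sketch below is only an illustration: the function names and the sample point are hypothetical, and the Euclidean scalar product on one-forms is taken to be the ordinary dot product of components.

```python
import math

def spherical_one_forms(r, theta, phi):
    """Components of dr, dtheta, dphi in the orthonormal basis (dx, dy, dz)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    dr = (st * cp, st * sp, ct)
    dtheta = (ct * cp / r, ct * sp / r, -st / r)
    dphi = (-sp / (r * st), cp / (r * st), 0.0)
    return dr, dtheta, dphi

def euclid(a, b):
    """Euclidean scalar product of two one-forms given by components."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical sample point, chosen away from the z-axis where sin(theta) = 0.
r, theta, phi = 2.0, 0.7, 1.1
dr, dtheta, dphi = spherical_one_forms(r, theta, phi)
norms = (euclid(dr, dr), euclid(dtheta, dtheta), euclid(dphi, dphi))
cross = (euclid(dr, dtheta), euclid(dr, dphi), euclid(dtheta, dphi))
```

Since the three one-forms are orthogonal, the scalar product of dr ∧ dθ ∧ dφ with itself is just the product of the three entries of `norms`.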

18.2. The star operator 




an orientation of V. We recall that our definition there was as follows. Any two bases 




Otherwise, they determine opposite orientations. The set of all bases breaks up into 
uivalence classes, and each ec 
the space Λⁿ(V) we can put this somewhat differently. The basis {e₁, …, e_n}
determines a basis - simply a non-zero vector - in the one-dimensional vector
space Λⁿ(V), namely the element

e₁ ∧ ⋯ ∧ e_n.

A second basis {f₁, …, f_n} determines the element f₁ ∧ ⋯ ∧ f_n. These two elements
differ by the scalar multiple Det B. Thus a choice of orientation on V is the same as
the choice of one of the two equivalence classes of non-zero elements of Λⁿ(V), when
we regard two such elements as equivalent if they differ from one another by a
positive multiple.

Now suppose that V has a non-degenerate scalar product. Let {e₁, …, e_n} be an
orthogonal basis of V with (e_i, e_i) = ±1. Any two such bases differ from one another
together with the scalar product, determines a unique element, e₁ ∧ ⋯ ∧ e_n of Λⁿ(V).


Suppose we are considering differential forms. Then at each point of F we get a 


>rds an n-lorm . lo 


cumbersome notation (and at the risk of some confusion), we shall also denote this
n-form by σ. Thus, if V = ℝⁿ is Euclidean n-space, then

σ = dx¹ ∧ dx² ∧ ⋯ ∧ dxⁿ.



We are now in a position to define a linear mapping from Λ^k(V*) to Λ^{n−k}(V*)
called the star operator. To achieve this, we shall identify both Λ^k(V*) and Λ^{n−k}(V*)
with the space of linear functions on Λ^{n−k}(V*), then identify elements of Λ^k(V*)
and Λ^{n−k}(V*) which correspond to the same linear function. This identification of
Λ^k(V*) with Λ^{n−k}(V*) will be called the star operator.

We first show how the wedge product, together with our choice of σ ∈ Λⁿ(V*),
assigns to each λ ∈ Λ^k(V*) a linear function on Λ^{n−k}(V*). Indeed, if ω ∈ Λ^{n−k}(V*)
then λ ∧ ω is an element of Λⁿ(V*). Hence it must be some multiple of σ. In other
words, we can write

λ ∧ ω = f(ω)σ.

Thus each λ defines a linear function

ω ↦ f(ω).

Now there is a unique element of Λ^{n−k}(V*), which we shall denote ★λ, which
determines the same function f(ω) from Λ^{n−k}(V*) to ℝ via the scalar product:

(★λ, ω) = f(ω).

Thus ★λ is defined by the requirement that

λ ∧ ω = (★λ, ω)σ

for all ω ∈ Λ^{n−k}(V*). Notice that this definition of the star operator depends on the
choice of orientation; reversing orientation changes the sign of σ and hence the sign
of ★λ.

To calculate ★λ, we must in general apply the above definition using each basis
element of Λ^{n−k}(V*) in turn as ω. If, however, λ is a basis element of Λ^k(V*), of the
form dx^{i₁} ∧ ⋯ ∧ dx^{i_k}, then (★λ, ω) = 0 for every basis element ω except the one
which involves the n − k factors which do not occur in λ. In this case λ ∧ ω is the
wedge product of dx¹ through dxⁿ in some order; so it is ±σ. Then ★λ = ±ω. The
only difficulty lies in determining the sign correctly.

The general definition

λ ∧ ω = (★λ, ω)σ

needs to be supplemented slightly when k = n or k = 0. We denote the basis element
of Λ⁰(V*) by 1, with the scalar product (1, 1) = 1 and the trivial wedge product
1 ∧ ω = ω ∧ 1 = ω. Then, if λ = σ, we need consider only the basis element ω = 1, and

σ ∧ 1 = (★σ, 1)σ

so that ★σ = 1. On the other hand, if λ = 1, we consider ω = σ, so that

1 ∧ σ = (★1, σ)σ

and ★1 = (σ, σ)σ. This means that whenever we work with a Euclidean scalar




(σ, σ) = −1, so that


The computation of the star operator is best illustrated by a few examples. 





Example 1. ℝ² with the Euclidean scalar product: basis for Λ¹(V*) is {dx, dy}, with
(dx, dx) = 1, (dy, dy) = 1 and σ = dx ∧ dy.

★(dx ∧ dy) = 1 (always true);

dx ∧ dy = (★dx, dy) dx ∧ dy so ★dx = dy;

dy ∧ dx = (★dy, dx) dx ∧ dy so ★dy = −dx;

★1 = dx ∧ dy because (σ, σ) = +1.

Example 2. Two-dimensional spacetime, with the Lorentz scalar product: basis for
Λ¹(V*) is {dt, dx}, with (dt, dt) = +1, (dx, dx) = −1 and σ = dt ∧ dx.

★(dt ∧ dx) = 1;

dt ∧ dx = (★dt, dx) dt ∧ dx so ★dt = −dx;

dx ∧ dt = (★dx, dt) dt ∧ dx so ★dx = −dt;

★1 = −dt ∧ dx because (σ, σ) = −1.


Example 3. ℝ³ with the Euclidean scalar product: basis for Λ¹(V*) is {dx, dy, dz},
with (dx, dx) = (dy, dy) = (dz, dz) = 1 and σ = dx ∧ dy ∧ dz.



Example 4. Four-dimensional spacetime with the Lorentz scalar product: basis for
Λ¹(V*) is {dt, dx, dy, dz} with (dt, dt) = 1, (dx, dx) = (dy, dy) = (dz, dz) = −1, and
σ = dt ∧ dx ∧ dy ∧ dz.

★(dt ∧ dx ∧ dy ∧ dz) = 1.

Now

(dt ∧ dx ∧ dy) ∧ dz = (★(dt ∧ dx ∧ dy), dz)σ,

so

★(dt ∧ dx ∧ dy) = −dz.

Similarly

★(dt ∧ dx ∧ dz) = +dy;

★(dt ∧ dy ∧ dz) = −dx.

Also

(dx ∧ dy ∧ dz) ∧ dt = (★(dx ∧ dy ∧ dz), dt)σ;

so

★(dx ∧ dy ∧ dz) = −dt.

Now consider two-forms:

(dt ∧ dx) ∧ (dy ∧ dz) = (★(dt ∧ dx), dy ∧ dz)σ;

so

★(dt ∧ dx) = dy ∧ dz.

Similarly

★(dt ∧ dy) = −dx ∧ dz,

★(dt ∧ dz) = dx ∧ dy;

(dx ∧ dy) ∧ (dt ∧ dz) = (★(dx ∧ dy), dt ∧ dz)σ;

so

★(dx ∧ dy) = −dt ∧ dz.

Similarly

★(dx ∧ dz) = dt ∧ dy,

★(dy ∧ dz) = −dt ∧ dx.

Next consider one-forms:

dt ∧ (dx ∧ dy ∧ dz) = (★dt, dx ∧ dy ∧ dz)σ;

so

★dt = −dx ∧ dy ∧ dz.

Also

dx ∧ (dt ∧ dy ∧ dz) = (★dx, dt ∧ dy ∧ dz)σ;

so

★dx = −dt ∧ dy ∧ dz.

Similarly

★dy = dt ∧ dx ∧ dz,

★dz = −dt ∧ dx ∧ dy.

Finally

★1 = −dt ∧ dx ∧ dy ∧ dz.
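
The sign bookkeeping in these examples follows one pattern, which can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the text: basis one-forms are labelled 0, …, n − 1 in the order defining σ, `signs[i]` is the value (dx^i, dx^i) = ±1, and the function names are hypothetical. For a basis k-form λ with complementary basis form ω, the definition λ ∧ ω = (★λ, ω)σ gives ★λ = ±(ω, ω)ω, where ± is the sign of the permutation that puts λ ∧ ω in the order of σ.

```python
def perm_sign(seq):
    """Sign of the permutation that sorts seq, found by counting inversions."""
    sign = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                sign = -sign
    return sign

def star_basis(indices, signs):
    """Star of the basis form dx^{i1} ^ ... ^ dx^{ik}.

    Returns (coefficient, rest): the result is coefficient times the wedge
    product of the remaining basis one-forms, taken in increasing order.
    """
    n = len(signs)
    rest = tuple(j for j in range(n) if j not in indices)
    wedge_sign = perm_sign(tuple(indices) + rest)  # lambda ^ omega = wedge_sign * sigma
    omega_norm = 1
    for j in rest:
        omega_norm *= signs[j]                     # (omega, omega) = product of the signs
    return wedge_sign * omega_norm, rest

lorentz = (1, -1, -1, -1)   # order dt, dx, dy, dz as in Example 4
euclid3 = (1, 1, 1)         # order dx, dy, dz as in Example 3
star_dt_dx = star_basis((0, 1), lorentz)
star_dx_euclid = star_basis((0,), euclid3)
```

For instance, `star_basis((0, 1), lorentz)` returns the coefficient +1 with the remaining indices (2, 3), that is, ★(dt ∧ dx) = dy ∧ dz, in agreement with Example 4.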


It is apparent that, in the above examples, ★(★λ) = ±λ. To discover the general
rule, we choose an orthogonal basis e₁, …, e_n of V with (e_i, e_i) = ±1. We consider
the case where λ ∈ Λ^k is the product of k of the e_i*’s and ω is ± the product of
the remaining (n − k) e_i*’s. Suppose we have chosen λ and ω so that λ ∧ ω = σ (e.g.,
λ = dx, ω = dy ∧ dz in ℝ³). Then ★λ = ±ω and ★ω = ±λ. To determine the signs,
set ★λ = c₁ω and ★ω = c₂λ. Now

σ = λ ∧ ω = (★λ, ω)σ

so

(★λ, ω) = c₁(ω, ω) = 1.

Also, by the rule for interchanging factors in a wedge product,

(−1)^{k(n−k)} σ = ω ∧ λ = (★ω, λ)σ

so

(★ω, λ) = c₂(λ, λ) = (−1)^{k(n−k)}.

But because λ and ω are basis elements,

(λ, λ)(ω, ω) = (λ ∧ ω, λ ∧ ω) = (σ, σ)

so

c₁c₂(σ, σ) = (−1)^{k(n−k)}.

Since (σ, σ) = ±1 we can move it to the other side of the equation. Thus

★(★λ) = c₁★ω = c₁c₂λ

where

c₁c₂ = (−1)^{k(n−k)} (σ, σ).

But by linearity of the ★ operator, if the above equations hold for each of the
basis elements of Λ^k(V*) they must hold for all of Λ^k(V*). Thus we have proved

★★ = (−1)^{k(n−k)} (σ, σ) on Λ^k(V*).

If n is odd and (σ, σ) = +1, as for ℝ³ with the Euclidean scalar product, then
★(★λ) = λ, but in general this is not the case.
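
The rule ★★ = (−1)^{k(n−k)}(σ, σ) on k-forms can be spot-checked on basis elements. The following Python sketch is an illustration only; it assumes an orthogonal basis with (dx^i, dx^i) = ±1 listed in `signs`, and the function names are hypothetical. It applies the basis-form recipe for ★ twice and compares the resulting sign with the formula.

```python
from itertools import combinations

def perm_sign(seq):
    """Sign of the permutation that sorts seq, found by counting inversions."""
    sign = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                sign = -sign
    return sign

def star_basis(indices, signs):
    """Star of a basis form: returns (coefficient, complementary indices)."""
    rest = tuple(j for j in range(len(signs)) if j not in indices)
    coeff = perm_sign(tuple(indices) + rest)
    for j in rest:
        coeff *= signs[j]
    return coeff, rest

def double_star_sign(indices, signs):
    """The scalar c with star(star(lambda)) = c * lambda for a basis form lambda."""
    c1, rest = star_basis(indices, signs)
    c2, _ = star_basis(rest, signs)
    return c1 * c2

def predicted_sign(k, signs):
    """(-1)^{k(n-k)} (sigma, sigma), where (sigma, sigma) is the product of the signs."""
    n = len(signs)
    sigma_norm = 1
    for s in signs:
        sigma_norm *= s
    return (-1) ** (k * (n - k)) * sigma_norm

def check(signs):
    n = len(signs)
    return all(double_star_sign(idx, signs) == predicted_sign(k, signs)
               for k in range(n + 1) for idx in combinations(range(n), k))

results = [check(s) for s in [(1, 1), (1, -1), (1, 1, 1), (1, -1, -1, -1)]]
```

The four signatures tried are those of Examples 1 to 4; any other list of ±1 entries can be substituted.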

Using the result just derived, that

★★ = (−1)^{k(n−k)} (σ, σ) on Λ^k(V*),

we can derive a useful explicit expression for the scalar product of any two k-forms.
Notice first that, because ★★ω = ±ω, four successive applications of the star
operator yield the identity: i.e.,

★★★★ω = ω.

By definition of the star operator,

(★★λ, ω)σ = ★λ ∧ ω

for any two k-forms λ and ω, so

(λ, ω)σ = (−1)^{k(n−k)} (σ, σ) ★λ ∧ ω
= ω ∧ ★λ (σ, σ).

But

★★★σ = ★★(1) = (σ, σ)1.

So by applying ★★★ to both sides we have

(λ, ω)(σ, σ)1 = ★★★(ω ∧ ★λ)(σ, σ)1

or

(λ, ω) = ★★★(ω ∧ ★λ) = ★★★(λ ∧ ★ω).

A similar useful result follows from

(★λ, ★ω)σ = λ ∧ ★ω








18.3. The Dirichlet integral and the Laplacian 

Let ω be a k-form and λ be a (k − 1)-form. Then

d(λ ∧ ★ω) = dλ ∧ ★ω + (−1)^{k−1} λ ∧ d(★ω).

So, for any (bounded) domain U we have

∫_U dλ ∧ ★ω + (−1)^{k−1} ∫_U λ ∧ d★ω = ∫_U d(λ ∧ ★ω) = ∫_∂U λ ∧ ★ω.

If U is unbounded and if λ or ω have compact support (or if they vanish sufficiently
rapidly at infinity), then by Stokes’ theorem

∫_U dλ ∧ ★ω + (−1)^{k−1} ∫_U λ ∧ d(★ω) = 0.

Now applying ★ to the rule (ρ, τ) = ★★★(ρ ∧ ★τ), we see that

dλ ∧ ★ω = (dλ, ω)★1.

On the other hand, ω is a k-form so ★ω is an (n − k)-form so d★ω is an (n − k + 1)-form.
Therefore

λ ∧ d★ω = (−1)^{(k−1)(n−k+1)} d★ω ∧ λ

= (−1)^{(k−1)(n−k+1)} (★d★ω, λ)σ.

Now ★d★ω is a (k − 1)-form, so

★★★d★ω = ★★(★d★ω) = (−1)^{(k−1)(n−k+1)} (σ, σ) ★d★ω,

and

σ = (σ, σ)★1.

Thus

λ ∧ d★ω = (λ, ★★★d★ω)★1.

So we can write

dλ ∧ ★ω + (−1)^{k−1} λ ∧ d★ω = [(dλ, ω) + (−1)^{k−1} (λ, ★★★d★ω)]★1.

Thus, if λ or ω vanish at ∂U (if U is bounded) or have compact support (or vanish
sufficiently rapidly at infinity),

∫_U (dλ, ω)★1 = (−1)^k ∫_U (λ, ★★★d★ω)★1.

Let us define the operator d* by

d*ω = (−1)^k ★★★d★ω = (−1)^k ★⁻¹d★ω

and write

D_U(λ, τ) = ∫_U (λ, τ)σ.

Then we have

D_U(dλ, ω) = D_U(λ, d*ω)

for λ or ω vanishing at ∂U (or sufficiently rapidly at infinity).
Notice that


□ = dd* + d*d.


□d = d□.


□★ = ★□.

To see that □d = d□, observe that

□d = (dd* + d*d)d = dd*d
and

d□ = d(dd* + d*d) = dd*d


□d* = d*□.

The proof that □★ = ★□ is trickier, requiring careful attention to signs. Recall that

d*ω = (−1)^k ★★★d★ω for a k-form ω.

On replacing the k-form ω by the (n − k)-form ★ω, we have

d*★ω = (−1)^{n−k} ★★★d★★ω.

Since d(★★ω) is a (k + 1)-form, ★d(★★ω) is an (n − k − 1)-form and

★★★d★★ω = (−1)^{(k+1)(n−k−1)} (σ, σ) ★d★★ω.

Since ω is a k-form,

★★ω = (−1)^{k(n−k)} (σ, σ)ω.

Combining these results, and using (σ, σ)² = 1, we find

d*★ω = (−1)^{n−k} (−1)^{(k+1)(n−k−1)} (−1)^{k(n−k)} ★dω

or

d*★ω = (−1)^{k+1} ★dω.

With this rule and the rule ★d*ω = (−1)^k d★ω, the proof becomes easy:

★□ω = ★d(d*ω) + ★d*(dω)

★□ω = (−1)^k d*★d*ω + (−1)^{k+1} d★dω

while

□★ω = dd*★ω + d*d★ω

□★ω = (−1)^{k+1} d★dω + (−1)^k d*★d*ω

so that

★□ = □★.
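
The only delicate point in this argument is the sign in d*★ω = (−1)^{k+1}★dω. Three factors contribute to it: (−1)^{n−k} from the definition of d* on the (n − k)-form ★ω, (−1)^{k(n−k)} from ★★ω, and (−1)^{(k+1)(n−k−1)} from ★★ acting on the (n − k − 1)-form ★dω. The short Python sketch below (the function name is hypothetical) simply confirms by brute force that the total exponent always has the parity of k + 1.

```python
def total_exponent(n, k):
    """Sum of the three exponents of -1 that arise in computing d*(star omega)."""
    return (n - k) + k * (n - k) + (k + 1) * (n - k - 1)

# omega is a k-form with 0 <= k <= n - 1, so that d(omega) can be non-zero.
parity_ok = all((total_exponent(n, k) - (k + 1)) % 2 == 0
                for n in range(1, 12) for k in range(n))
```

Algebraically the sum is 2n + 2kn − 2k² − 3k − 1, which is congruent to k + 1 modulo 2, so the check succeeds for every n and k.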


The □ operator on ℝ³

We compute the Laplacian explicitly in ℝ³. For a function f,

□f = dd*f + d*df.

But ★f is a three-form and dΩ = 0 for any three-form Ω on ℝ³. Thus □f = d*df.
Now

df = (∂f/∂x) dx + (∂f/∂y) dy + (∂f/∂z) dz

and

★((∂f/∂x) dx) = (∂f/∂x) dy ∧ dz

and so

d★((∂f/∂x) dx) = (∂²f/∂x²) dx ∧ dy ∧ dz

and

d*((∂f/∂x) dx) = −★d★((∂f/∂x) dx) = −∂²f/∂x².

Similar computations for the remaining two terms show that

□f = −(∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z²).
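
With this sign convention □f is minus the usual sum of second partial derivatives. As a rough numerical illustration, the Python sketch below approximates □f by central second differences; the function names, the step size h and the sample polynomial are hypothetical choices, and the finite-difference formula is an approximation rather than the definition used in the text.

```python
def box_on_function(f, x, y, z, h=1e-3):
    """Approximate Box f = -(f_xx + f_yy + f_zz) at (x, y, z) by central differences."""
    f0 = f(x, y, z)
    fxx = (f(x + h, y, z) - 2 * f0 + f(x - h, y, z)) / h**2
    fyy = (f(x, y + h, z) - 2 * f0 + f(x, y - h, z)) / h**2
    fzz = (f(x, y, z + h) - 2 * f0 + f(x, y, z - h)) / h**2
    return -(fxx + fyy + fzz)

def sample(x, y, z):
    """Hypothetical test function x^2 y + z^3, for which Box f = -(2y + 6z)."""
    return x**2 * y + z**3

approx = box_on_function(sample, 1.0, 2.0, 3.0)
exact = -(2 * 2.0 + 6 * 3.0)
```

For polynomials of degree at most three the central difference is exact up to rounding, so `approx` and `exact` agree to many digits.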




Now let us compute the Laplacian for one-forms: it is enough to compute □ω for
ω = a dx. We can rotate to interchange any of the axes and □, being defined purely
by the scalar product and orientation, will commute with all rotations. So, if we
have a formula for □(a dx), a similar formula will hold for □(b dy) and □(c dz). By
linearity, we will then get a formula for a dx + b dy + c dz.

Now

d(a dx) = da ∧ dx = (∂a/∂y) dy ∧ dx + (∂a/∂z) dz ∧ dx.


So 



Hence, using the fact that ★ = ★⁻¹ for two-forms on ℝ³,



Now 

d*(a dx) = −★d★(a dx) = −★d(a dy ∧ dz)



□(a dx) = (□a) dx = −(∂²a/∂x² + ∂²a/∂y² + ∂²a/∂z²) dx.


mi 


r.Tsi 


mi 
coefficients.

Any two-form on ℝ³ can be written as ★ω where ω is a one-form. Since □★ = ★□,
we see that, for







A = A′ − dψ.

We shall have

dA = B.

Then, since ψ is a function,

□ψ = d*dψ = d*A′

so

d*A = d*A′ − d*dψ = 0.

Thus our equations become

d*dA = 4πj,

or, since d*A = 0,










Then 



Let us consider four-dimensional spacetime with the coordinates t , x, y, z and scalar 



So 


□f = −∂²f/∂t² + ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z².

For one-forms, let us first consider adt. Then 





So 



On the other hand, 



= ★⁻¹d(a dx ∧ dy ∧ dz)



□(b dx) = (□b) dx,


□(c dy) = (□c) dy,
etc.


"The same is true for two-forms. Consider, 


d*(a dt ∧ dx) = ★⁻¹d★(a dt ∧ dx)







SO 


*d(adt a dx) = — ^ dz + — dy, 




dz 


t _1 / _1 J _ 1 \_J 

f d 2 a d 2 a'\ { 

, d 2 a 

d*d(fldr a dx) = — j 

\dy 2 dz 2 J 

dv a dz — - , dx a dz 

- dy dx 


d 2 a 


dt Adz 


d 2 a 


dt a dy 


d 2 a 


dx a dy. 


dzdt 


dydt 


d*d(adt a dx) = — ★ M^dfadt a dx) 


dx dz 


Then 

and 



d 2 a 

. .. . 1 

d 2 a 

L. 

d 2 a d 2 a\ 

| j rCTyfi/ ' \ VIA J ^ | | H- J VJ V / \ VIA 1 

dt 2 

' 5x 2 



We will leave to you the similar verification that

□(b dx ∧ dz) = (□b) dx ∧ dz.


It then follows that for any two-form, expressed in terms of dt, dx, dy, dz,
applying □ is the same as applying □ to the coefficients. Having verified this for
zero-, one-, and two-forms, it now follows for three- and four-forms from ★□ = □★.


18.5. The Clifford algebra 


By now you should be reasonably convinced that, in Euclidean coordinates on ℝⁿ,
we can compute the operator □ applied to any k-form by applying □ to each of
its coefficients. You might also be somewhat apprehensive that, if we try to prove



up in a mess. The purpose of this section is to introduce an algebraic formalism 
that will allow a simple direct proof. We will not need the results of this section 
again in this book - we have already proved the results for the important cases 
of three and four dimensions. On the other hand, the ideas we will introduce here 
have proved to be important in modern physics. 

Our proof is going to make use directly of the property that

D_U(dω, τ) = D_U(ω, d*τ) + boundary terms,

which we can phrase by saying that d* is the formal adjoint of d. The word formal
refers to the fact that there are boundary terms: we only have

D_U(dω, τ) = D_U(ω, d*τ)

when various hypotheses are made that guarantee that there are no contributions
from the boundary. Let us study this situation in slightly more generality.

Let E and F be vector spaces, each equipped with its own scalar product. We
may then identify E with E* and F with F*. In particular, if A: E → F is a linear
map, then we may regard A*: F* → E* as a linear map A*: F → E defined by

(e, A*f)_E = (Ae, f)_F

















(b) i i >1. Then 


fii<n = dx x a dx,-, a • • • a dxi 


a •••Ad x k+1 if Ji > 


Then 


(e 1 co,T) = 0 = ^co,i^—J tJ if ji > 1- 


Oi a), t) = 0 = (^co, i ^—Jr 


unless j 2 — i± > J 3 — ^2? n Ji — *1> 73 ^2*•••> then 

(e^T) = (dx 1 ,dx 1 )(co, co) 


while 


( . 

( d \ 

C0,1 

LL) 

V 

\ ox i/ 


T = (CO, COj. 


This proves that 

ε₁* = (dx₁, dx₁) i(∂/∂x₁).

map determined by the scalar product on V. Let θ be any one-form and ε(θ) denote
exterior multiplication by θ. Then


Let us define 






We have thus shown that 


d* = −Σ_{p=1}^{n} i_p ∂/∂x_p.


Now it follows from the anticommutativity of exterior multiplication that

ε_pε_q + ε_qε_p = 0 for all p and q





i_pi_q + i_qi_p = 0 for all p and q.

If p ≠ q, then exterior multiplication by dx_p and interior multiplication by i(∂/∂x_q)


Indeed, it is enough to check this for p = 1, q = 2, and we leave this routine 
verification to you. On the other hand, for basis elements ω,


_ !! 


[0 if co = dxj a •••, 

\^1/ 1 

[ CD if co = dx 2 A • • •, 


while 


., 7 a \ | 

CO if CD_=_dx 1 A • • •, 



\O x lJ 1 

^U 11 CO — O.X 2 A * • ■. 


Thus 


e(dx 1 ) + e(dx 1 )i 


= id. 


\ i-v—v L j l * 

\^ x iJ \dxi 

As there is nothing special about 1, we conclude that


i_pε_p + ε_pi_p = ± id, where ± = sign(dx_p, dx_p).


Finally, since the operators i_p and ε_q are all constant (in our coordinate description),
we have



□ = dd* + d*d = −Σ_{p,q} (ε_p i_q + i_q ε_p) ∂²/∂x_p∂x_q

= −Σ_p ± ∂²/∂x_p².

This completes our proof of the formula for □ in Euclidean coordinates.

For the case of positive-definite scalar product, the operators ε_p and i_q have an
important significance in quantum physics where they are known as the creation
and annihilation operators for fermions. The relations

ε_pε_q + ε_qε_p = 0, i_pi_q + i_qi_p = 0,

ε_pi_q + i_qε_p = id if p = q,
ε_pi_q + i_qε_p = 0 if p ≠ q,

are known as the anticommutation relations for these operators.
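
These relations can be verified mechanically on the basis of the exterior algebra. The Python sketch below is an illustration under stated assumptions: the scalar product is taken to be positive-definite, a form is stored as a dictionary from increasing index tuples to coefficients, and the names `eps`, `iota`, `add` and `check_relations` are hypothetical. Here `eps(p, ·)` is exterior multiplication by dx_p and `iota(p, ·)` is interior multiplication by ∂/∂x_p.

```python
from itertools import combinations

def eps(p, form):
    """Exterior multiplication by dx_p (the 'creation' operator)."""
    out = {}
    for idx, c in form.items():
        if p in idx:
            continue                                   # dx_p ^ dx_p = 0
        sign = (-1) ** sum(1 for j in idx if j < p)    # move dx_p into sorted position
        new = tuple(sorted(idx + (p,)))
        out[new] = out.get(new, 0) + sign * c
    return out

def iota(p, form):
    """Interior multiplication by d/dx_p (the 'annihilation' operator)."""
    out = {}
    for idx, c in form.items():
        if p not in idx:
            continue
        sign = (-1) ** idx.index(p)                    # move dx_p to the front, then remove it
        new = tuple(j for j in idx if j != p)
        out[new] = out.get(new, 0) + sign * c
    return out

def add(a, b):
    """Sum of two forms, dropping terms whose coefficients cancel."""
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def check_relations(n):
    """Test the three anticommutation relations on every basis form in dimension n."""
    for k in range(n + 1):
        for idx in combinations(range(n), k):
            omega = {idx: 1}
            for p in range(n):
                for q in range(n):
                    if add(eps(p, eps(q, omega)), eps(q, eps(p, omega))) != {}:
                        return False
                    if add(iota(p, iota(q, omega)), iota(q, iota(p, omega))) != {}:
                        return False
                    mixed = add(eps(p, iota(q, omega)), iota(q, eps(p, omega)))
                    if mixed != (omega if p == q else {}):
                        return False
    return True

relations_hold = check_relations(4)
```

Since the operators are linear, checking them on basis forms is enough to establish the relations on the whole exterior algebra in that dimension.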

In mathematics, the algebra generated by these operations - that is, the set of
all sums and products of the εs and is - is a special case of a family of algebras
constructed by Clifford in the last century. These algebras - i.e., sets of objects in
which addition and multiplication are defined - were created by Clifford as a
generalization of the complex number system and of the quaternions of Hamilton.
As these algebras play an important role in modern physics, we take this space
here to describe them.

We begin by reformulating the preceding example of the anticommutation 
relations of the creation and annihilation operators for fermions. Let V be a vector 
space, and V* its dual space. Let us consider the direct sum space W = V ⊕ V*.

Thus a vector of W is a pair w = (v, α) where v ∈ V and α ∈ V*. Let us define a scalar
product on W by

(w, w′) = ½(α(v′) + α′(v))

if w = (v, α) and w′ = (v′, α′).
Let v₁, …, v_n be a basis of V and let α¹, …, αⁿ be the dual basis of V*. Then the vectors

(v₁, 0), …, (v_n, 0), (0, α¹), …, (0, αⁿ)

form a basis of W. Let us call these vectors by the names e_p and i_q. That is, we define:

e_p = (v_p, 0), i_q = (0, α^q).

Thus

(e_p, e_q) = 0, (i_p, i_q) = 0,

(e_p, i_q) = ½ if p = q,
(e_p, i_q) = 0 if p ≠ q.


We can now write the anticommutation relations for the εs and is in a succinct
way. Let us think of the operator ε_p as being associated to the vector e_p and the
operator i_q as being associated to the vector i_q. That is, we consider the linear
map γ which is defined by

γ(e_p) = ε_p,

γ(i_q) = i_q.

Then, for

w = v¹e₁ + ⋯ + vⁿe_n + α₁i₁ + ⋯ + α_n i_n,

γ(w) = v¹ε₁ + ⋯ + vⁿε_n + α₁i₁ + ⋯ + α_n i_n

is an element of our algebra and

γ(w)γ(w′) + γ(w′)γ(w) = 2(w, w′)id.

This can be generalized to any vector space W with a (possibly degenerate) scalar
product: To each such vector space W with scalar product ( , ), we associate an
associative algebra† C(W) called the Clifford algebra of W and ( , ). There should be
a linear map γ: W → C(W) such that

(i) Every element of C(W) can be written as a sum of products of the elements
γ(w) (and of multiples of 𝟙)

(ii) γ(w)γ(w′) + γ(w′)γ(w) = 2(w, w′)𝟙

(When we use the phrase the Clifford algebra, this tacitly assumes a theorem - that
given W and ( , ) there exists a C(W) satisfying (i) and (ii) and that C(W) is unique,
up to isomorphism. We briefly sketch the proof of this theorem in the appendix
to this chapter.)

† By an associative algebra A we mean that A is a vector space with a bilinear map A × A → A
called multiplication. One assumes that this multiplication satisfies the associative law and
that there exists an identity, 𝟙, for multiplication.

Let us work out some examples: 

(a) Take W to be one-dimensional with a negative-definite scalar product. So
there is a vector e with

(e, e) = −1

and every element of W is a multiple of e. Condition (ii) says that

γ(e)² = −𝟙

and condition (i) then says that every element of C(W) can be written as

a𝟙 + bγ(e).

If we call γ(e) = i, we see that C(W) is precisely the algebra of complex numbers.

(b) Take W = ℝ² with the negative of the usual scalar product. So

(v, v) = −x² − y² if v = (x, y).

Set

i = γ((1, 0)) and j = γ((0, 1)).

Then condition (ii) says that

i² = −𝟙, j² = −𝟙

and

ij + ji = 0.

Condition (i) then implies that every element of C(W) can be written as

a𝟙 + bi + cj + dk

where we have set

k = ij.

To summarize:

i² = j² = k² = −𝟙,

ij = k, jk = i, ki = j,

ij = −ji, jk = −kj, ik = −ki.

This algebra is known as the algebra of quaternions. It was first discovered by
Hamilton.

In general, in checking condition (ii), it is enough to check that

γ(w)² = (w, w)𝟙.

Indeed,

γ(w + w′)² = [γ(w) + γ(w′)]² = γ(w)² + γ(w)γ(w′) + γ(w′)γ(w) + γ(w′)²

and

γ(w + w′)² = (w + w′, w + w′)𝟙

= (w, w)𝟙 + 2(w, w′)𝟙 + (w′, w′)𝟙

so we get condition (ii).

With this in mind, let us take W = ℝ^{1,3}, that is, W is spacetime with its Lorentz
metric. Set

w = (t, x, y, z)

and

γ(w) =
( 0         0         t + z     x + iy )
( 0         0         x − iy    t − z  )
( t − z     −x − iy   0         0      )
( −x + iy   t + z     0         0      )

Then

γ(w)² = (t² − x² − y² − z²) ×
( 1 0 0 0 )
( 0 1 0 0 )
( 0 0 1 0 )
( 0 0 0 1 )

So if we call the 4 × 4 identity matrix 𝟙, then we see that (ii) is satisfied. We can,
in fact, regard C(W) as a certain subalgebra of the algebra of 4 × 4 complex
matrices. The algebra C(W) is called the Dirac algebra. It was developed by Dirac
for his study of the electron and plays a central role in the modern theory of
elementary particles.
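
The matrix identity γ(w)² = (t² − x² − y² − z²)𝟙 is easy to confirm numerically. The Python sketch below is an illustration only: it assumes the block form of γ(w) with upper-right block ((t + z, x + iy), (x − iy, t − z)) and lower-left block ((t − z, −x − iy), (−x + iy, t + z)), and the helper names and sample vectors are hypothetical.

```python
def gamma(t, x, y, z):
    """The 4 x 4 complex matrix gamma(w) for w = (t, x, y, z), in the assumed block form."""
    return [
        [0, 0, t + z, x + 1j * y],
        [0, 0, x - 1j * y, t - z],
        [t - z, -x - 1j * y, 0, 0],
        [-x + 1j * y, t + z, 0, 0],
    ]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lorentz(w, v):
    """Lorentz scalar product (w, v) = t t' - x x' - y y' - z z'."""
    return w[0] * v[0] - w[1] * v[1] - w[2] * v[2] - w[3] * v[3]

w = (2, 3, 5, 7)
square = matmul(gamma(*w), gamma(*w))
```

Integer inputs keep the arithmetic exact, so `square` can be compared entry by entry with (w, w) times the identity; condition (ii) for a pair w, w′ can be tested in the same way.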

(c) Take W to be any vector space and ( , ) to be identically zero. Then (ii)
says that

γ(w)γ(w′) = −γ(w′)γ(w)

for any w and w′. In this case, C(W) is exactly the exterior algebra, Λ(W).


18.6. The star operator and geometry 

We saw in Chapter 16 that the star operator going from Λ¹(ℝ³) to Λ²(ℝ³) determines
the Euclidean scalar product on ℝ³. This is the mathematical expression of
the fact that dielectric properties of the vacuum determine the Euclidean geometry of
space.

We can thus pose the following question. Let V be a vector space with orientation
and let ( , ) and ( , )′ be two non-degenerate scalar products on V. Suppose that
for some 1 ≤ k < n = dim V they give rise to the same star operator from Λ^k(V*)
to Λ^{n−k}(V*). Does this imply that ( , ) = ( , )′? The answer is ‘almost’. Here are
some examples.

Take $V = \mathbb{R}^2$ with its usual Euclidean scalar product. We can then identify $(\mathbb{R}^{2*})$ with $\mathbb{R}^2$. The star operator $*: \Lambda^1(\mathbb{R}^{2*}) \to \Lambda^1(\mathbb{R}^{2*})$ is, as we have seen, in this


given by 


for some symmetric matrix A. If ( , )' determines the same operator, we must have 



But if $RAR^{-1} = A$ for some rotation $R$ (other than through 0° or 180°), then $A$ must be a scalar multiple of the identity. Thus

$$(u, v)' = \lambda(u, v)$$

for some non-zero number $\lambda$.

In fact, as we shall check in a moment, any such $( , )'$ with $\lambda > 0$ determines the same $*$ operator. Thus in the plane, the $*$ operator does not determine the Euclidean geometry, but it does determine the conformal geometry of the plane. This fact lies at the basis of the theory of functions of a complex variable. We shall present an introduction to this subject in Chapter 20.


Let us see, in general, what the effect of a scale transformation - replacing v by 
cv for some non-zero number c - has on the * operator. Multiplying lengths by c 



or, to get the same linear function of co, we must have 


„ 2 k — n _ 


For $k = n/2$ a scale transformation has no effect on $*: \Lambda^k(V^*) \to \Lambda^k(V^*)$. It is not hard to prove - much as the proof we gave in Chapter 16 for $*: \Lambda^1(\mathbb{R}^{3*}) \to \Lambda^2(\mathbb{R}^{3*})$ - that in all other cases the star operator determines the scalar product.


We now turn to another important point that we can only discuss briefly. Recall 






gave rise to the theory of general relativity of Einstein. The theory of the star 


18.7. The star operator and vector calculus 

The existence of a scalar product establishes, as usual, a correspondence between the space $V$ and its dual $V^* = \Lambda^1(V)$, which we can use to identify a differential form with a vector field. As a notational convenience, we shall identify the vector field associated with a differential form by writing an arrow over the symbol for the form: thus, if $A$ is a one-form, $\vec{A}$ is the vector field related to it by

$$\omega[\vec{A}] = (A, \omega) \quad\text{for all one-forms } \omega.$$


Thus if 


the associated one-form is 


A — A^ y T A2&2 T 


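where for each term the $\pm$ sign is chosen according to whether $(e_i, e_i) = \pm 1$. In $\mathbb{R}^3$, for example, the vector field

$$\vec{A} = A_x e_x + A_y e_y + A_z e_z$$

corresponds to the one-form

$$A = A_x\,dx + A_y\,dy + A_z\,dz,$$

while in spacetime the vector field

$$\vec{A} = A_t e_t + A_x e_x + A_y e_y + A_z e_z$$

corresponds to

$$A = A_t\,dt - A_x\,dx - A_y\,dy - A_z\,dz.$$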


X 





operator and the d operator. Then the dot product may be written 

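$$\vec{A}\cdot\vec{B} = *(A \wedge *B) = *(B \wedge *A)$$

while the cross product is

$$\vec{A}\times\vec{B} = *(A \wedge B).$$

The differential operators are

$$\operatorname{grad} f = df,$$
$$\operatorname{curl}\vec{A} = *dA,$$
$$\operatorname{div}\vec{A} = *d*A.$$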
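To make the algebraic formulas for the dot and cross products concrete, here is a small sketch in Euclidean $\mathbb{R}^3$ with the orientation $dx \wedge dy \wedge dz$. One-forms are stored as coefficient triples on $(dx, dy, dz)$ and two-forms as triples on $(dy \wedge dz, dz \wedge dx, dx \wedge dy)$; this storage convention and all function names are assumptions made for the example.

```python
# A minimal sketch of *, wedge, dot and cross on constant-coefficient
# forms in Euclidean R^3 (orientation dx ^ dy ^ dz).

def wedge_1_1(a, b):
    """a ^ b for one-forms; result on the basis (dy^dz, dz^dx, dx^dy)."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def wedge_1_2(a, w):
    """a ^ w for a one-form a and a two-form w; coefficient of dx^dy^dz."""
    return a[0] * w[0] + a[1] * w[1] + a[2] * w[2]

def star_1(a):
    """*dx = dy^dz, *dy = dz^dx, *dz = dx^dy: coefficients are unchanged."""
    return tuple(a)

def star_2(w):
    """*(dy^dz) = dx, *(dz^dx) = dy, *(dx^dy) = dz."""
    return tuple(w)

def star_3(c):
    """*(c dx^dy^dz) = c."""
    return c

def dot(a, b):
    return star_3(wedge_1_2(a, star_1(b)))

def cross(a, b):
    return star_2(wedge_1_1(a, b))

A = (1, 2, 3)
B = (4, 5, 6)
print(dot(A, B), cross(A, B))
```

In an orthonormal basis of positive-definite signature the star operator only relabels basis elements, which is why `star_1` and `star_2` leave the coefficient triples unchanged.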


In coordinates other than Cartesian, the association between a vector $\vec{A}$ and a differential form $A$ is most conveniently made by first constructing an orthonormal basis of one-forms at each point. For example, in cylindrical coordinates, $dr$, $r\,d\theta$, and $dz$ are orthonormal one-forms, dual to the (unit) basis vectors $e_r$, $e_\theta$, and $e_z$. Thus

$$\vec{A} = A_r e_r + A_\theta e_\theta + A_z e_z$$

is associated with the one-form

$$A = A_r\,dr + A_\theta (r\,d\theta) + A_z\,dz.$$

Similarly, in spherical coordinates, $dr$, $r\,d\theta$, and $r\sin\theta\,d\phi$ are orthonormal, so

$$\vec{A} = A_r e_r + A_\theta e_\theta + A_\phi e_\phi$$

is associated with

$$A = A_r\,dr + A_\theta\,r\,d\theta + A_\phi\,r\sin\theta\,d\phi.$$

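It now becomes a simple matter to compute div, grad, and curl in cylindrical or spherical coordinates. The rule for applying $d$ is the same in any coordinate system, while the star operator acts on the orthonormal basis $\{dr, r\,d\theta, dz\}$ or $\{dr, r\,d\theta, r\sin\theta\,d\phi\}$ exactly as it does on $\{dx, dy, dz\}$. For example, we compute curl $\vec{A}$ in cylindrical coordinates, using curl $\vec{A} = *dA$:

$$\vec{A} = A_r e_r + A_\theta e_\theta + A_z e_z,$$

$$A = A_r\,dr + rA_\theta\,d\theta + A_z\,dz,$$

$$dA = \left(\frac{\partial (rA_\theta)}{\partial r} - \frac{\partial A_r}{\partial \theta}\right)\frac{1}{r}\,(dr \wedge r\,d\theta) + \left(\frac{\partial A_z}{\partial \theta} - \frac{\partial (rA_\theta)}{\partial z}\right)\frac{1}{r}\,(r\,d\theta \wedge dz) + \left(\frac{\partial A_r}{\partial z} - \frac{\partial A_z}{\partial r}\right)(dz \wedge dr),$$

$$*dA = \left(\frac{\partial (rA_\theta)}{\partial r} - \frac{\partial A_r}{\partial \theta}\right)\frac{1}{r}\,dz + \left(\frac{\partial A_z}{\partial \theta} - \frac{\partial (rA_\theta)}{\partial z}\right)\frac{1}{r}\,dr + \left(\frac{\partial A_r}{\partial z} - \frac{\partial A_z}{\partial r}\right) r\,d\theta,$$

so that, on replacing $dr$ by $e_r$, $r\,d\theta$ by $e_\theta$, $dz$ by $e_z$,

$$\operatorname{curl}\vec{A} = \frac{1}{r}\left(\frac{\partial A_z}{\partial \theta} - \frac{\partial (rA_\theta)}{\partial z}\right) e_r + \left(\frac{\partial A_r}{\partial z} - \frac{\partial A_z}{\partial r}\right) e_\theta + \frac{1}{r}\left(\frac{\partial (rA_\theta)}{\partial r} - \frac{\partial A_r}{\partial \theta}\right) e_z.$$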

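Similarly, we compute in spherical coordinates the divergence of a radial vector field using div $\vec{A} = *d*A$:

$$\vec{A} = A_r e_r,$$

$$A = A_r\,dr,$$

$$*A = A_r\,(r\,d\theta \wedge r\sin\theta\,d\phi) = r^2 A_r \sin\theta\,d\theta \wedge d\phi,$$

$$d*A = \frac{\partial (r^2 A_r)}{\partial r}\sin\theta\,dr \wedge d\theta \wedge d\phi = \frac{1}{r^2}\frac{\partial (r^2 A_r)}{\partial r}\,dr \wedge r\,d\theta \wedge r\sin\theta\,d\phi,$$

$$*d*A = \frac{1}{r^2}\frac{\partial (r^2 A_r)}{\partial r} = \operatorname{div}\vec{A}.$$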

In both examples, the secret is to express all differential forms in terms of ortho- 
normal one-forms before applying the star operator. 
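The radial divergence formula $\operatorname{div}\vec{A} = r^{-2}\,\partial(r^2 A_r)/\partial r$ can be compared numerically with the Cartesian divergence. The sketch below is only an illustration: the test field $A_r = r^2$, the sample point and the step size `h` are arbitrary choices, and derivatives are approximated by central differences.

```python
import math

def div_cartesian(F, p, h=1e-5):
    """Central-difference approximation of dFx/dx + dFy/dy + dFz/dz at p."""
    x, y, z = p
    dfx = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    dfy = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    dfz = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return dfx + dfy + dfz

def div_radial(A_r, r, h=1e-5):
    """(1/r^2) d(r^2 A_r)/dr, with the derivative taken numerically."""
    g = lambda s: s * s * A_r(s)
    return (g(r + h) - g(r - h)) / (2 * h) / (r * r)

def A_r(r):
    """Hypothetical radial component used for the test: A_r = r^2."""
    return r * r

def F(x, y, z):
    """The same field A_r e_r written in Cartesian components."""
    r = math.sqrt(x * x + y * y + z * z)
    return (A_r(r) * x / r, A_r(r) * y / r, A_r(r) * z / r)

p = (1.0, 2.0, 2.0)          # a point with r = 3
r = math.sqrt(sum(c * c for c in p))
cartesian_value = div_cartesian(F, p)
spherical_value = div_radial(A_r, r)
print(cartesian_value, spherical_value)   # both should be close to 4r = 12
```

For $A_r = r^2$ the exact divergence is $4r$, so both numbers should agree with 12 up to discretization error.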


The notion of the tensor product of two (or more) vector spaces is essential to 






spaces, but, as it does not appear in many of the elementary linear algebra texts, 
we give a brief introduction to the subject here. 

Let $V$ and $W$ be vector spaces, and let $U$ be a third vector space. A map $f: V \times W \to U$ is called bilinear if

$$f(v_1 + v_2, w) = f(v_1, w) + f(v_2, w),$$

$$f(v, w_1 + w_2) = f(v, w_1) + f(v, w_2),$$

and

$$f(av, w) = af(v, w) = f(v, aw) \quad\text{for scalar } a.$$

A familiar example, where $V = W = U$, is the vector product in ordinary three-dimensional space.

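Starting with $V$ and $W$ we wish to construct a vector space $Z$ and a bilinear map $b: V \times W \to Z$ which is universal in the following sense. Suppose that $f: V \times W \to U$ is any bilinear map. Then there exists a unique linear map $l_f: Z \to U$ such that

$$f = l_f \circ b.$$

Diagrammatically,

$$\begin{array}{ccc} V \times W & \xrightarrow{\;b\;} & Z \\ & {\scriptstyle f}\searrow & \downarrow {\scriptstyle l_f} \\ & & U \end{array}$$

for any $f$ there is a unique linear $l_f$ making the diagram commute, i.e. $l_f \circ b = f$.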

Notice that if $b$ and $Z$ exist, they are unique up to isomorphism. Indeed, suppose that $b': V \times W \to Z'$ is a second solution. Taking $f = b'$ in the above diagram, we get a linear map $l_{b'}: Z \to Z'$ with

$$b' = l_{b'} \circ b.$$

Similarly we get a linear map $l_b: Z' \to Z$ with

$$b = l_b \circ b'.$$

But then

$$b = (l_b \circ l_{b'}) \circ b.$$

But

$$b = \mathrm{id} \circ b,$$

so then $\mathrm{id}$ and $l_b \circ l_{b'}$ both satisfy the equation

$$b = l \circ b.$$

By the uniqueness, we conclude that

$$l_b \circ l_{b'} = \mathrm{id};$$

and similarly

$$l_{b'} \circ l_b = \mathrm{id}.$$

Thus $l_{b'}$ gives an isomorphism of $Z$ with $Z'$ and $b' = l_{b'} \circ b$. So, up to isomorphism, $(b, Z)$ is unique.

Our problem is thus to show that such a ( b,Z ) actually exists. We shall give 
two rather different looking constructions. Of course, we know by uniqueness that 
they must give the same final answer. 

Let $Y$ denote the space of all scalar-valued bilinear functions on $V \times W$. Thus $\alpha \in Y$ is a bilinear map of $V \times W \to \mathbb{R}$. It is clear that the set of all bilinear functions, $\alpha$, forms a vector space and this is $Y$. Now take

$$Z = Y^*.$$

So a vector in $Z$ is a linear function on $Y$. Define the map $b: V \times W \to Z$ by

$$[b(v, w)](\alpha) = \alpha(v, w).$$

That is, $b(v, w)$ is that linear function that assigns to each bilinear function, $\alpha$, the value $\alpha(v, w)$. It is clear that $b(v, w)$ is a linear function of $\alpha$ and that it depends bilinearly on $v$ and $w$. We claim that $b$ and $Z$ give a solution to our universal problem.


Indeed, suppose that $f: V \times W \to U$ is a bilinear map, where $U$ is some vector space. For each $\nu \in U^*$ we get a bilinear function

$$\nu \circ f \in Y.$$

We have thus defined a linear map, $l^*: U^* \to Y$:

$$l^*(\nu) = \nu \circ f.$$

Therefore, we get its adjoint, $l^{**}: Z = Y^* \to U^{**}$. By definition,

$$\begin{aligned} (l^{**}(b(v, w)))(\nu) &= b(v, w)(l^*(\nu)) \\ &= (l^*(\nu))(v, w) \\ &= (\nu \circ f)(v, w) \\ &= \nu(f(v, w)). \end{aligned}$$

In other words, $l^{**}(b(v, w))$ is that linear function on $U^*$ which assigns to each $\nu$ the number $\nu(f(v, w))$. Now recall that there is a canonical isomorphism

$$U \to U^{**}$$

where each $u \in U$ is identified with the linear function $i(u)$ on $U^*$ given by

$$i(u)(\nu) = \nu(u), \quad u \in U,\ \nu \in U^*.$$

We can thus write

$$l^{**}(b(v, w)) = i(f(v, w)).$$

Using the identification $i$, we can write $l^{**} = i \circ l$, where $l$ is then defined as a map of $Z \to U$. Thus we have found a map $l: Z \to U$ satisfying $l \circ b = f$.

Suppose there were two such maps. Then the difference, $m$, would satisfy

$$(m \circ b)(v, w) = 0 \quad\text{for all } v \in V,\ w \in W.$$

So

$$m^*(\nu)(b(v, w)) = 0 \quad\text{for all } \nu \in U^*.$$

Thus $m^*(\nu) \in Y$ is that bilinear function on $V \times W$ which assigns 0 to all pairs $(v, w)$. Thus $m^*(\nu)$ is the zero bilinear function. Thus $m^*$ sends all of $U^*$ into 0 so is the zero linear map. Thus $m = 0$. This proves the uniqueness of $l$.

The space $Z$ is called the tensor product of the spaces $V$ and $W$ and is denoted by $V \otimes W$. We shall also use the notation

$$b(v, w) = v \otimes w.$$

Suppose that $V$ and $W$ are finite-dimensional. Let $\{e_1, \ldots, e_m\}$ be a basis of $V$ and $\{f_1, \ldots, f_n\}$ be a basis of $W$. Then a bilinear function $\beta$ on $V \times W$ is completely determined by its values $\beta_{ij} = \beta(e_i, f_j)$: we can write

$$\beta = \sum_{i,j} \beta_{ij}\,\varepsilon_{ij}$$

where $\varepsilon_{ij}$ is the bilinear function determined by

$$\varepsilon_{ij}(v, w) = v_i w_j \quad\text{if } v = v_1 e_1 + \cdots + v_m e_m,\ w = w_1 f_1 + \cdots + w_n f_n.$$

The $\varepsilon_{ij}$ are clearly linearly independent and span $Y$; they form a basis of $Y$. The dual basis of $Z = Y^*$ is just $b(e_i, f_j) = e_i \otimes f_j$. Thus,

If $\{e_1, \ldots, e_m\}$ is a basis of $V$ and $\{f_1, \ldots, f_n\}$ is a basis of $W$, then $\{e_i \otimes f_j\}$, $i = 1, \ldots, m$, $j = 1, \ldots, n$ gives a basis in $V \otimes W$. In particular

$$\dim(V \otimes W) = (\dim V) \times (\dim W).$$
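In coordinates, the basis statement above says that $v \otimes w$ can be stored as the list of all products $v_i w_j$. The sketch below uses that convention (an assumption of this example, as are the function names) to illustrate bilinearity and the dimension count for $\dim V = 2$, $\dim W = 3$.

```python
# A minimal coordinate sketch of v (x) w as the list of products v_i * w_j.

def tensor(v, w):
    """Coordinates of v (x) w in the basis e_i (x) f_j, ordered by (i, j)."""
    return [vi * wj for vi in v for wj in w]

def basis(n):
    """Standard basis of R^n as lists."""
    return [[1 if i == k else 0 for i in range(n)] for k in range(n)]

def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]

m, n = 2, 3
product_basis = [tensor(e, f) for e in basis(m) for f in basis(n)]

v1, v2, w = [1, 2], [3, -1], [2, 0, 5]
lhs = tensor(vec_add(v1, v2), w)
rhs = vec_add(tensor(v1, w), tensor(v2, w))
print(len(product_basis), lhs == rhs)
```

The six vectors in `product_basis` are exactly the standard basis of $\mathbb{R}^6$, which is the coordinate form of $\dim(V \otimes W) = (\dim V)(\dim W)$.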

Here is another construction of $V \otimes W$. It is more direct - and abstract - but has the disadvantage of invoking some infinite-dimensional spaces. For any set $M$, let $F(M)$ denote the vector space of all formal linear combinations of elements of $M$. Thus an element of $F(M)$ is a finite expression of the form

$$a_{m_1} m_1 + \cdots + a_{m_k} m_k$$

where the $a_{m_i}$ are scalars. We add two such expressions by adding the coefficients. If $M$ were a finite set, $F(M)$ would simply be the space of all functions on $M$, and the function $f$ corresponds to the formal expression

$$\sum_{m \in M} f(m)\,m.$$

If $M$ is not finite, then $F(M)$ can be thought of as the space of all functions on $M$ which vanish except on a finite number of points. (Here we are identifying $m \in M$ with the function $\delta_m$ where $\delta_m(n) = 0$ if $n \ne m$ and $\delta_m(m) = 1$. Thus the most general element of $F(M)$ is a finite linear combination of the $\delta_m$. The map $M \to F(M)$ sending $m \mapsto \delta_m$ gives $M$ as a subset of $F(M)$.)

The space $F(M)$ is universal with respect to maps of $M$ into vector spaces: given any map $\phi: M \to U$, where $U$ is a vector space, there is a unique linear map, $L_\phi: F(M) \to U$ such that

$$L_\phi(\delta_m) = \phi(m).$$

Indeed, this last equation defines $L_\phi$ on the elements $\delta_m$ and hence extends by linearity to all of $F(M)$.
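The universal property of $F(M)$ is easy to model: a formal linear combination is a finite table of coefficients, and $L_\phi$ is obtained by extending $\phi$ linearly. In the sketch below elements of $F(M)$ are Python dictionaries and the target space $U$ is taken to be the scalars; both choices, and the names `delta`, `combine` and `extend`, are assumptions made for illustration.

```python
# A minimal sketch of the free vector space F(M) on a set M.
# An element is a dict {m: coefficient} with finitely many entries.

def delta(m):
    """The basis element delta_m of F(M)."""
    return {m: 1}

def combine(a, b, s=1):
    """Return a + s*b, dropping terms whose coefficient becomes zero."""
    out = dict(a)
    for m, c in b.items():
        out[m] = out.get(m, 0) + s * c
        if out[m] == 0:
            del out[m]
    return out

def extend(phi):
    """Linear extension L_phi of a map phi: M -> scalars."""
    def L(element):
        return sum(c * phi(m) for m, c in element.items())
    return L

# Hypothetical example: M is a set of strings and phi is the length function.
L_len = extend(len)
element = combine(combine({}, delta("ab"), 2), delta("xyz"), -3)
print(element, L_len(element))
```

Here `L_len(delta(m))` returns `len(m)` for every string `m`, which is the defining equation $L_\phi(\delta_m) = \phi(m)$, and linearity determines the value on every other element.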

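Now let us get back to our problem. Take $M = V \times W$. If $\phi$ is any map of the set $V \times W \to U$, there is an $L_\phi: F(V \times W) \to U$ which is linear and satisfies $L_\phi(\delta_m) = \phi(m)$ for all $m \in V \times W$. If $f: V \times W \to U$ is bilinear, then $L_f$ must vanish on all the elements

$$\delta_{(v_1 + v_2, w)} - \delta_{(v_1, w)} - \delta_{(v_2, w)},$$

$$\delta_{(v, w_1 + w_2)} - \delta_{(v, w_1)} - \delta_{(v, w_2)},$$

$$\delta_{(rv, w)} - r\,\delta_{(v, w)},$$

$$\delta_{(v, rw)} - r\,\delta_{(v, w)}.$$

Thus, $L_f$ must vanish on the subspace $B$ spanned by these elements. We can thus define $V \otimes W$ to be the quotient space

$$V \otimes W = F(V \times W)/B$$

and the map $b: V \times W \to V \otimes W$ by letting $b(v, w)$ be the equivalence class of $\delta_{(v, w)}$ modulo $B$. This definition of $b$ and $V \otimes W$ fulfills the universal properties.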
Let $V'$ and $W'$ be another pair of vector spaces. Let $A: V \to V'$ and $B: W \to W'$ be linear transformations. Consider the map sending $(v, w)$ into $Av \otimes Bw$. This is clearly bilinear. Hence there exists a unique linear map from $V \otimes W \to V' \otimes W'$, which we shall denote by $A \otimes B$, such that

$$(A \otimes B)(v \otimes w) = Av \otimes Bw.$$

Suppose that $A': V' \to V''$ and $B': W' \to W''$ are a second pair of vector spaces and linear transformations. Then the map sending $(v, w)$ into $A'Av \otimes B'Bw$ gives

$$A'A \otimes B'B: V \otimes W \to V'' \otimes W''.$$

On the other hand,

$$A \otimes B: V \otimes W \to V' \otimes W',$$
$$A' \otimes B': V' \otimes W' \to V'' \otimes W'',$$

so we get

$$(A' \otimes B')(A \otimes B): V \otimes W \to V'' \otimes W''.$$

By construction $(A' \otimes B')(Av \otimes Bw) = A'Av \otimes B'Bw$. Hence we conclude (from the uniqueness part of the universal property) that

$$(A' \otimes B')(A \otimes B) = A'A \otimes B'B.$$
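In coordinates, $A \otimes B$ is represented by the Kronecker product of the matrices of $A$ and $B$, and the composition rule above becomes an identity between matrix products. The following sketch checks it on small integer matrices; the matrices are arbitrary test data and the helper names are assumptions of this example.

```python
# A minimal sketch: Kronecker products as coordinate form of A (x) B.

def matmul(a, b):
    """Product of matrices given as lists of rows."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
        for r in range(len(a))
    ]

def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def kron(a, b):
    """Kronecker product: block (i, j) of the result is a[i][j] * b."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def tensor(v, w):
    """Coordinates of v (x) w, ordered compatibly with kron."""
    return [vi * wj for vi in v for wj in w]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
A2 = [[2, 0], [1, 1]]
B2 = [[1, -1], [0, 2]]
v, w = [1, -2], [3, 5]

# (A (x) B)(v (x) w) = Av (x) Bw
acts_ok = matvec(kron(A, B), tensor(v, w)) == tensor(matvec(A, v), matvec(B, w))
# (A' (x) B')(A (x) B) = A'A (x) B'B
compose_ok = matmul(kron(A2, B2), kron(A, B)) == kron(matmul(A2, A), matmul(B2, B))
print(acts_ok, compose_ok)
```

The ordering of coordinates in `tensor` and `kron` must match; here both run through the first factor in the outer loop.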

products. Suppose 


that $V$ and $W$ are equipped with scalar products, $( , )_V$ and $( , )_W$. Consider the scalar valued function $f$ defined on $(V \times W) \times (V \times W)$ by

$$f(v, w, v', w') = (v, v')_V\,(w, w')_W.$$


* or lixed v 


tion ..a on 


For fixed $v$ and $w$, this expression is bilinear in $v'$ and $w'$. Since every $\alpha \in V \otimes W$ is a finite sum of elements of the form $v \otimes w$, we conclude that

$$f_{v', w'}(\alpha)$$

is bilinear in $v'$ and $w'$, for any fixed $\alpha$ in $V \otimes W$. Thus there is a linear function $f_\alpha$ on $V \otimes W$ such that

$$f_\alpha(v' \otimes w') = f_{v', w'}(\alpha).$$


is linear in $\alpha$ and (anti-)linear in $\beta$. It is easy to check that


(MW=W 

defines a scalar product on $V \otimes W$. To summarize:

There is a unique scalar product $( , )_{V \otimes W}$ defined on $V \otimes W$ which has the property that

$$(v \otimes w, v' \otimes w')_{V \otimes W} = (v, v')_V\,(w, w')_W.$$


Tensor products and Hom

Here is one further identification involving tensor products which is very useful. Let $V$ and $W$ be vector spaces. We define the vector space $\mathrm{Hom}(W, V)$ to be the space of all linear transformations from $W$ to $V$. This is a vector space (by the addition of linear transformations) whose dimension is $(\dim W)(\dim V)$. For each $v \in V$ and $\beta \in W^*$, consider the rank 1 linear transformation $T_v^\beta: W \to V$ defined by

$$T_v^\beta(w) = \langle \beta, w \rangle v.$$

Here $\langle \beta, w \rangle$ denotes the value of the linear function $\beta \in W^*$ on the vector $w \in W$. The map $T_v^\beta$ clearly depends linearly on $v$ for fixed $\beta$ and linearly on $\beta$ for fixed $v$. Thus we have a unique linear map

$$i: V \otimes W^* \to \mathrm{Hom}(W, V)$$

determined by

$$i(v \otimes \beta)w = \langle \beta, w \rangle v.$$

It is easy to check that the map $i$ is an injection. It maps $V \otimes W^*$ onto the subspace of $\mathrm{Hom}(W, V)$ consisting of those linear transformations of finite rank. If $W$ or $V$ is finite-dimensional, this is all of $\mathrm{Hom}(W, V)$. If $V$ and $W$ are both infinite-dimensional, this is a proper subspace.
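In finite dimensions the map $i$ sends $v \otimes \beta$ to the matrix whose $(r, c)$ entry is $v_r \beta_c$, and every matrix is a sum of such rank 1 pieces. The sketch below illustrates both facts; the coordinate conventions and function names are assumptions made for this example.

```python
# A minimal sketch of i(v (x) beta) as an outer-product matrix.

def rank_one(v, beta):
    """Matrix of the map w -> <beta, w> v from W to V."""
    return [[vi * bj for bj in beta] for vi in v]

def apply(T, w):
    return [sum(t * x for t, x in zip(row, w)) for row in T]

def pairing(beta, w):
    """Value <beta, w> of the linear function beta on the vector w."""
    return sum(b * x for b, x in zip(beta, w))

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def basis(n):
    return [[1 if i == k else 0 for i in range(n)] for k in range(n)]

v, beta, w = [1, 2], [3, 0, -1], [2, 5, 4]
T = rank_one(v, beta)
image = apply(T, w)                     # equals <beta, w> v

# Any 2 x 3 matrix M is the sum of M[r][c] * i(e_r (x) f_c^*).
M = [[1, 2, 3], [4, 5, 6]]
rebuilt = [[0, 0, 0], [0, 0, 0]]
for r, e in enumerate(basis(2)):
    for c, f_dual in enumerate(basis(3)):
        piece = rank_one([M[r][c] * x for x in e], f_dual)
        rebuilt = mat_add(rebuilt, piece)
print(image, rebuilt == M)
```

The decomposition of `M` is the finite-dimensional reason that $i$ is onto $\mathrm{Hom}(W, V)$.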


Higher order tensor products 

There are a number of extensions and modifications of the notion of tensor product which we now briefly describe. Suppose that instead of just two vector spaces $V$ and $W$ we had $k$ vector spaces $V_1, \ldots, V_k$. We can then define a $k$-linear (or multilinear) map $f: V_1 \times V_2 \times \cdots \times V_k \to U$ to be a map which is linear in any one of the variables when all the others are held fixed. Then there is a universal space $V_1 \otimes \cdots \otimes V_k$ and multilinear map $m: V_1 \times \cdots \times V_k \to V_1 \otimes V_2 \otimes \cdots \otimes V_k$ just as in the other case. It follows from the universal properties that there is an isomorphism of

$$(V_1 \otimes \cdots \otimes V_k) \otimes (V_{k+1} \otimes \cdots \otimes V_{k+l})$$

with

$$V_1 \otimes \cdots \otimes V_{k+l}$$

since they both satisfy the universal property for $(k + l)$-linear maps.

As in the case $k = 2$, if we are given linear maps $A_1: V_1 \to W_1$, $A_2: V_2 \to W_2$, etc., we get a linear map

$$A_1 \otimes \cdots \otimes A_k: V_1 \otimes \cdots \otimes V_k \to W_1 \otimes \cdots \otimes W_k.$$

The generalization of the rule for composition holds as well. Also, under the 





identification of (T^ <g) • • • ® V k )® (V k+ 1 <g> • • • (g) V k+; ) with V t 


k + h We 


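Now suppose that $V_1 = V_2 = \cdots = V_k$ are the same vector space. We say that a multilinear map $f: V \times \cdots \times V$ ($k$ times) $\to U$ is antisymmetric if

$$f(v_1, \ldots, v_i, v_{i+1}, \ldots, v_k) = -f(v_1, \ldots, v_{i+1}, v_i, \ldots, v_k)$$

for each $i$. It follows that, for any permutation $\pi$,

$$f(v_{\pi(1)}, \ldots, v_{\pi(k)}) = (\operatorname{sign}\pi)\,f(v_1, \ldots, v_k).$$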

We can now look for a universal space and antisymmetric multilinear map. That is, we look for a vector space $\Lambda^k(V)$ and an antisymmetric multilinear map $m_k: V \times \cdots \times V \to \Lambda^k(V)$ such that for any antisymmetric multilinear map $f: V \times \cdots \times V \to U$ there is a unique linear map $l_f: \Lambda^k(V) \to U$ such that $l_f \circ m_k = f$.

It follows immediately that, if $(m_k, \Lambda^k(V))$ exist, they are unique up to isomorphism. Also, an examination of either of our two proofs of the existence of the tensor products leads, with minor modifications, to a proof of the existence of $\Lambda^k(V)$ and $m_k$. In particular, the first proof shows the following:

Let $V^*$ be the dual space of $V$. Then $\Lambda^k(V^*)$ can be identified with the space of all $k$-linear antisymmetric functions on $V \times \cdots \times V$.


Exterior algebra 

Consider the map $m_{k+l}: V \times \cdots \times V$ ($k + l$ times) into $\Lambda^{k+l}(V)$. It is multilinear and




map ol 


known as exterior multiplication, and denoted by $\wedge$. From the universal properties,

$$(\omega \wedge \sigma) \wedge \tau = \omega \wedge (\sigma \wedge \tau)$$

as maps into $\Lambda^{k+l+p}(V)$ where $\omega \in \Lambda^k(V)$, $\sigma \in \Lambda^l(V)$ and $\tau \in \Lambda^p(V)$. Also, by evaluating on basis elements, it can be checked that




is multilinear and antisymmetric in $v_1, \ldots, v_k$ and is multilinear and antisymmetric in the $w_1, \ldots, w_k$. Thus, just as we argued for the induced scalar product on $V \otimes W$, it follows from the universal properties that $f$ induces a scalar product on $\Lambda^k(V)$. In other words, we get a unique scalar product $( , )_{\Lambda^k(V)}$ which is determined by

$$(v_1 \wedge \cdots \wedge v_k, w_1 \wedge \cdots \wedge w_k)_{\Lambda^k(V)} = \operatorname{Det}((v_i, w_j)_V).$$
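The determinant formula for the induced scalar product can be tried out directly. The sketch below takes $V = \mathbb{R}^3$ with the Euclidean scalar product (an assumption of the example) and compares the Gram determinant for $k = 2$ with the classical identity $(v_1 \times v_2) \cdot (w_1 \times w_2)$; the function names are illustrative.

```python
# A minimal sketch of (v_1 ^ ... ^ v_k, w_1 ^ ... ^ w_k) = Det((v_i, w_j)).

def det(m):
    """Determinant by expansion along the first row (fine for small sizes)."""
    if not m:
        return 1
    total = 0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def wedge_pairing(vs, ws, form=dot):
    """Scalar product of v_1 ^ ... ^ v_k with w_1 ^ ... ^ w_k."""
    return det([[form(v, w) for w in ws] for v in vs])

v1, v2 = [1, 2, 3], [0, 1, 4]
w1, w2 = [2, 0, 1], [1, 1, 1]

value = wedge_pairing([v1, v2], [w1, w2])
swapped = wedge_pairing([v2, v1], [w1, w2])
classical = dot(cross(v1, v2), cross(w1, w2))
print(value, swapped, classical)
```

Swapping two of the $v_i$ swaps two rows of the matrix and so changes the sign, which is the antisymmetry required of a function on $\Lambda^k(V)$.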


we have studied in this chapter, such as the exterior algebra and the Clifford algebra. 

Here are some of the details. A vector space $A$ is called an algebra if there is given a map $m$ (called multiplication)

$$m: A \otimes A \to A.$$


fc shall denot 


sociative 


for all a, h, and c. It is said to have a unit (denoted by \ A or just 1 if there is no risk of 


conlus 


for all elements of A. Unless otherwise specified, all algebras will be assumed 


Let $V$ be a vector space. Consider the following universal problem: to find an algebra $U$ and a linear map $i: V \to U$ such that for any algebra $A$ and any linear map $f: V \to A$ there exists a unique homomorphism $\phi: U \to A$ such that

$$\phi(1_U) = 1_A$$

and

$$\phi(i(v)) = f(v) \quad\text{for all } v \in V.$$

(Here the word homomorphism means that $\phi$ is a linear map satisfying $\phi(ab) = \phi(a)\phi(b)$.)


' ' • \ / / —--- ^ A \ * / X. A 

by the same arguments that apply to all universal constructs. (They are really arguments that belong to category theory.) The problem, as usual, is to construct one such algebra. Consider the (infinite) direct sum



to be the unique map given by 

$$m[(v_1 \otimes \cdots \otimes v_p) \otimes (w_1 \otimes \cdots \otimes w_q)] = v_1 \otimes \cdots \otimes v_p \otimes w_1 \otimes \cdots \otimes w_q$$

(just remove the parentheses). Multilinearity guarantees that this is well defined. So we know how to multiply $a_p$ with $b_q$ where

$$a_p \in T_p(V) = V \otimes \cdots \otimes V \quad (p \text{ factors})$$

and

$$b_q \in T_q(V).$$

(We define $r \cdot a = ra$ for $r \in \mathbb{R}$.) Since every element of

$$T(V) = \bigoplus_{r=0}^{\infty} T_r(V)$$

is a finite sum $a = a_0 + a_1 + \cdots + a_r$ of elements $a_p \in T_p(V)$, the distributive law (bilinearity) then determines multiplication in $T(V)$. Take $U = T(V)$ and let $i: V \to T(V)$ be the map which simply identifies $V$ as the $r = 1$ piece, $T_1(V)$, of the direct sum, $T(V)$.
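Once a basis of $V$ is chosen, the rule "just remove the parentheses" becomes concatenation of words in the basis indices. The sketch below models an element of $T(V)$ as a dictionary from index tuples to coefficients, with the empty tuple standing for the unit; this data model and the function names are assumptions made for illustration.

```python
# A minimal sketch of multiplication in the tensor algebra T(V).
# A basis monomial e_{i1} (x) ... (x) e_{ip} is the tuple (i1, ..., ip).

def multiply(a, b):
    """Bilinear extension of concatenation of basis monomials."""
    out = {}
    for word_a, ca in a.items():
        for word_b, cb in b.items():
            word = word_a + word_b            # remove the parentheses
            out[word] = out.get(word, 0) + ca * cb
    return {word: c for word, c in out.items() if c != 0}

ONE = {(): 1}
e1 = {(1,): 1}
e2 = {(2,): 1}
x = {(): 2, (1,): 1}          # the element 2 + e1, mixing degrees 0 and 1

e1e2 = multiply(e1, e2)
e2e1 = multiply(e2, e1)
assoc_left = multiply(multiply(x, e2), e1)
assoc_right = multiply(x, multiply(e2, e1))
print(e1e2, e2e1, assoc_left == assoc_right)
```

The products `e1e2` and `e2e1` are different basis monomials, so $T(V)$ is not commutative; no relation is imposed beyond bilinearity and associativity.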

Now let $f: V \to A$ be any linear map. Then $f(v_1)f(v_2)$ (multiplication in $A$) is bilinear in $v_1$ and $v_2$, and hence defines a map

$$\phi_2: V \otimes V \to A.$$

Similarly

$$\phi_r: T_r(V) \to A$$

is uniquely defined by

$$\phi_r(v_1 \otimes \cdots \otimes v_r) = f(v_1) \cdots f(v_r).$$

Finally set

$$\phi(a_0 + \cdots + a_r) = \phi_0(a_0) + \cdots + \phi_r(a_r)$$

(where $\phi_0(a_0) = a_0 1_A$). It is easy to check that $\phi$ is a homomorphism and is the unique one which satisfies

$$\phi(i(v)) = f(v) \quad\text{for all } v \in V.$$

Thus $T(V)$, $i$ is the ‘universal’ algebra over $V$.

If $A$ is an algebra, a subspace $I \subset A$ is called a (two-sided) ideal if

$$a \in A,\ b \in I \quad\text{implies}\quad ab \in I \text{ and } ba \in I.$$

We can then define multiplication on the quotient space

$$B = A/I$$

by

$$[a/I] \cdot [a'/I] = [aa'/I]$$

where we have used the notation $[a/I]$ to denote the equivalence class of $a$ mod $I$. The point is that the above rule is well defined - it is independent of the particular choice of $a$ or $a'$ mod $I$.



The Clifford algebra

We can now use this notion of quotient algebra to solve other ‘universal construction’ problems. For example, let us consider Clifford algebras. Let $V$ be a vector space with a quadratic form $Q$, and associated scalar product $( , )$. Let $A$ be an associative algebra with unit $1_A$. A Clifford map is a linear map

$$f: V \to A$$

such that

$$f(v)^2 = Q(v)1_A \quad\text{for all } v \in V.$$

We can equally write this condition as

$$f(u)f(v) + f(v)f(u) = 2(u, v)1_A.$$

The Clifford algebra over $V$ is an algebra $C(V, Q)$ together with a Clifford map $j: V \to C(V, Q)$ which is universal in the usual sense: given any Clifford map $f: V \to A$ there is a unique homomorphism $\phi: C(V, Q) \to A$ such that $f = \phi \circ j$.



As usual, if a $C(V, Q)$ exists, it must be unique up to isomorphism. The problem is to construct one. The universal property of the tensor algebra forces our hand. If there is to be such a map $j$ of $V$ into the algebra $C(V, Q)$, we must have a unique homomorphism $\psi: T(V) \to C(V, Q)$ such that $j = \psi \circ i$.

Still assuming for the moment that $C(V, Q)$ and $j$ exist, let $J \subset T(V)$ be the set of all $b$ in $T(V)$ such that

$$\psi(b) = 0.$$

Then $J$ is an ideal, and we would get an isomorphism of algebras

$$T(V)/J \to C(V, Q)$$

induced from $\psi$:

$$\bar{\psi}[a/J] = \psi(a) \quad\text{(independent of the choice of } a \text{ in } [a/J]).$$

Now the elements

$$v \otimes v - Q(v)1_{T(V)}$$

would certainly have to lie in $J$ by the defining property of Clifford maps. Hence $J$ would have to contain all sums of right and left multiples of these elements. So





let us start afresh. Let $I$ be the ideal in $T(V)$ generated by the elements $v \otimes v - Q(v)1_{T(V)}$. That is, $I$ consists of all (finite) sums of expressions of the form

$$a \cdot (v \otimes v - Q(v)1_{T(V)}) \cdot b$$

where $a$ and $b$ range over $T(V)$ and $v$ over $V$. Then $I$ is an ideal. Define

$$C(V, Q) = T(V)/I$$

and

$$j: V \to C(V, Q), \qquad j(v) = [i(v)/I] = [v/I].$$

By construction $j$ is a Clifford map and we leave it to you to check that all the universal properties are satisfied. In the special case that $Q = 0$, the Clifford condition becomes

$$f(u)f(v) = -f(v)f(u).$$

In this case (as you should check), the Clifford algebra is exactly the exterior algebra $\Lambda(V)$. A detailed analysis of the structure of Clifford algebras is given at the end of the Exercises to this chapter.
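When $Q$ is diagonal in a basis $e_0, e_1, \ldots$, the quotient construction can be carried out by hand: a product of basis vectors is brought to increasing order using $e_i e_j = -e_j e_i$ for $i \ne j$ and $e_i e_i = Q(e_i)1$. The sketch below implements just this reduction. The data model (dictionaries from increasing index tuples to coefficients) and the function names are assumptions made for illustration, and only diagonal quadratic forms are handled.

```python
# A minimal sketch of multiplication in C(V, Q) for a diagonal form Q.
# q[i] is Q(e_i); a basis monomial is an increasing tuple of indices.

def blade_product(a, b, q):
    """Reduce e_a e_b to (coefficient, increasing tuple of indices)."""
    word = list(a) + list(b)
    coeff = 1
    changed = True
    while changed:
        changed = False
        for i in range(len(word) - 1):
            if word[i] == word[i + 1]:
                coeff *= q[word[i]]          # e_i e_i = Q(e_i) 1
                del word[i:i + 2]
                changed = True
                break
            if word[i] > word[i + 1]:
                word[i], word[i + 1] = word[i + 1], word[i]
                coeff = -coeff               # e_i e_j = -e_j e_i for i != j
                changed = True
                break
    return coeff, tuple(word)

def multiply(x, y, q):
    """Bilinear extension of blade_product to general elements."""
    out = {}
    for blade_x, cx in x.items():
        for blade_y, cy in y.items():
            c, blade = blade_product(blade_x, blade_y, q)
            out[blade] = out.get(blade, 0) + cx * cy * c
    return {blade: c for blade, c in out.items() if c != 0}

def add(x, y):
    out = dict(x)
    for blade, c in y.items():
        out[blade] = out.get(blade, 0) + c
    return {blade: c for blade, c in out.items() if c != 0}

e0, e1 = {(0,): 1}, {(1,): 1}

# Negative-definite plane: the products reproduce the quaternion relations.
q_neg = [-1, -1]
k = multiply(e0, e1, q_neg)
k_squared = multiply(k, k, q_neg)

# Q = 0: squares vanish and products anticommute, as in the exterior algebra.
q_zero = [0, 0]
square_zero = multiply(e0, e0, q_zero)
anticommutator = add(multiply(e0, e1, q_zero), multiply(e1, e0, q_zero))
print(k_squared, square_zero, anticommutator)
```

With `q_neg` the elements `e0`, `e1` and `k` behave like $i$, $j$ and $k$; with `q_zero` every square is zero, which is the exterior algebra case mentioned in the text.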


Summary 

A Scalar products and the star operator

Given an orthonormal basis for $\Lambda^1(V^*)$, you should be able to compute scalar products on $\Lambda^k(V^*)$.

Given a vector space with a scalar product (not necessarily positive-definite) and an orientation, you should be able to use the definition of the star operator to compute $*\lambda$ for an arbitrary basis $k$-form $\lambda$.

You should be able to define the Laplace operator and express it in terms of partial derivatives with respect to Cartesian coordinates.

B Vector calculus

You should be able to define div, grad and curl in terms of $d$ and $*$ and to use these definitions to prove identities of vector calculus or to express differential operators in orthogonal coordinate systems.


Exercises 

18.1. Two-dimensional spacetime has affine coordinates $t$ and $x$, with scalar product $(e_t, e_t) = +1$, $(e_x, e_x) = -1$, $(e_t, e_x) = 0$. The basis two-form is $dt \wedge dx$.

(a) Using the definition of the star operator,

$$\lambda \wedge \omega = (*\lambda, \omega)\sigma,$$

calculate $*1$, $*dt$, $*dx$, and $*(dt \wedge dx)$.

(b) Another pair of affine coordinate functions on this space is

$$u = t - x, \quad v = t + x.$$

Calculate $du$, $dv$, $*du$, $*dv$.

(c) Let $f$ be a twice-differentiable function on two-dimensional spacetime. Calculate the Laplacian of $f$,

$$\Box f = -d*df,$$

in terms of partial derivatives of $f$ with respect to $t$ and $x$.

(d) Calculate $\Box f$ in terms of partial derivatives of $f$ with respect to $u$ and $v$.

18.2. When polar coordinates are used in the plane, the vectors $e_r$ and $e_\theta$ are an orthonormal basis at every point except the origin, where $\theta$ is not defined. The one-forms $dr$ and $r\,d\theta$ are dual to these basis vectors.

(a) Using the definition of the star operator,

$$\omega \wedge \lambda = (*\omega, \lambda)\sigma,$$

calculate $*dr$ and $*d\theta$.

(b) Let $f$ be a twice-differentiable function on the plane. Calculate the Laplacian of $f$,

$$\Box f = -d*df,$$

in terms of partial derivatives of $f$ with respect to $r$ and $\theta$.

18.3. Translate the following identities into vector calculus notation ($f$ denotes a function and $A$ a one-form, but your answers will involve a vector $\vec{A}$).

(a) $d(df) = 0$.

(b) $d(dA) = 0$. (In $\mathbb{R}^3$, $**$ is the identity.)

(c) $d(fA) = df \wedge A + f\,dA$. (Apply $*$ to both sides.)

(d) $d(A \wedge B) = dA \wedge B - A \wedge dB$. (Look at $(*d*)*(A \wedge B)$.)

(e) $d(f * A) = df \wedge *A + f\,d*A$. (Apply $*$ to both sides.)

18.4. Let $f(t, x, y, z)$ be a twice-differentiable function on four-dimensional spacetime. Calculate the Laplacian $\Box f$ in terms of partial derivatives of $f$.

18.5. Let $\vec{B}$ be a vector field in spherical coordinates: $\vec{B} = B_r e_r + B_\theta e_\theta + B_\phi e_\phi$. By calculating $*d*B$, where $B$ is the associated one-form, develop an expression for div $\vec{B}$.

18.6. (a) Let $\vec{A} = A_x e_x + A_y e_y + A_z e_z$. By evaluating $(dd* + d*d)A$, where $A$ is the one-form associated with $\vec{A}$, find an explicit formula for the vector Laplacian of $\vec{A}$ in terms of partial derivatives of $A_x$, $A_y$, $A_z$.

(b) Do the same in cylindrical coordinates, where $\vec{A} = A_r e_r + A_\theta e_\theta + A_z e_z$. Careful: The coefficient of $d\theta$ will be $rA_\theta$!

18.7. Develop a complete expression for curl $\vec{A}$ in spherical coordinates; i.e., start with

$$\vec{A} = A_r e_r + A_\theta e_\theta + A_\phi e_\phi$$

so that the corresponding form is

$$A = A_r\,dr + A_\theta\,r\,d\theta + A_\phi\,r\sin\theta\,d\phi.$$

Calculate $*dA$, and reinterpret as a vector.


The purpose of the next discussion is to describe all (finite-dimensional) Clifford algebras over the real numbers. Recall that the Clifford algebra $C(V, Q)$ is completely determined by the vector space $V$ and the quadratic form $Q$. This














has the following implication, as follows immediately from the ‘universal’ property of the Clifford algebra:

Let $f: V \to V'$ be a linear map which ‘preserves length’, that is, satisfies

$$Q'(f(v)) = Q(v) \quad\text{for all } v \in V.$$

Then $f$ induces a unique homomorphism, $\phi_f: C(V, Q) \to C(V', Q')$ such that

$$\phi_f(v) = f(v) \quad\text{for } v \in V,$$

when we regard $V$ as a subspace of $C(V, Q)$. In particular, if $f$ is an isometry, that is, $f$ is also a linear isomorphism of $V$ with $V'$, then $\phi_f$ is an isomorphism of $C(V, Q)$ with $C(V', Q')$. Thus the classification of Clifford algebras is the same as the classification of vector spaces with quadratic forms. We know the answer to this classification problem: the quadratic form is completely classified by its signature $(p, q)$. More precisely, let $V^\perp \subset V$ denote the subspace of $V$ consisting of all vectors $u$ which satisfy $(u, v)_Q = 0$ for all $v \in V$. Then $Q$ induces a scalar product on the quotient space $V/V^\perp$ which is non-degenerate, and we know from Gram-Schmidt that this induced scalar product is determined up to isomorphism by the numbers $p$ and $q$ of $+$ and $-$ signs in its orthonormal basis. If we choose a vector space complement $W$ to $V^\perp$, so we can write $V = W \oplus V^\perp$, then $W$ is isometric to $V/V^\perp$. If $\dim V = n$, then $V$ is isometric to $\mathbb{R}^n$ where the standard basis $\delta_1, \ldots, \delta_n$ is orthogonal with

$$Q(\delta_i) = 1, \quad 1 \le i \le p,$$

$$Q(\delta_i) = -1, \quad p + 1 \le i \le p + q,$$

and

$$Q(\delta_i) = 0, \quad p + q + 1 \le i \le n.$$

Thus the Clifford algebras are completely specified by the integers $(n, p, q)$ with $p + q \le n$. What we wish to do is get some insight into what each of these algebras actually looks like. We begin with an elementary but important consequence of our observation that an isometry $f: V \to V'$ induces an isomorphism $\phi_f$ of the corresponding Clifford algebra. Consider the map $f$ of $V$ into itself given by

$$f(v) = -v.$$

It is clearly an isometry and $f^2 = \mathrm{id}$. It therefore induces an isomorphism $\phi_f$ of the Clifford algebra $C = C(V, Q)$ with itself. We shall denote this isomorphism by $\omega$, so

$$\omega = \phi_f.$$

Notice that it follows from general principles that, if $f: V \to V'$ and $g: V' \to V''$ are isometries, then

$$\phi_{g \circ f} = \phi_g \circ \phi_f.$$

Taking $V = V' = V''$ and $g = f$, we get

$$\phi_{f^2} = \omega^2.$$

But $f^2 = \mathrm{id}$ and, by the uniqueness property of $\phi$, $\phi_{\mathrm{id}}$ must be the identity isomorphism of $C$ with itself. Thus

$$\omega^2 = \mathrm{id}.$$

We say that $\omega$ is an involution of $C$.

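The $\mathbb{Z}_2$ grading

Let $C_0$ be the subspace of $C$ consisting of all $c$ satisfying

$$\omega(c) = c$$

and let $C_1$ be the subspace consisting of all $c$ satisfying

$$\omega(c) = -c.$$

Then any $c$ can be written as

$$c = c_0 + c_1, \qquad c_0 = \tfrac{1}{2}(c + \omega(c)), \qquad c_1 = \tfrac{1}{2}(c - \omega(c)).$$

Thus

$$C = C_0 \oplus C_1$$

as a vector space. Also if

$$\omega(c) = (-1)^i c \quad\text{and}\quad \omega(c') = (-1)^j c'$$

then

$$\omega(cc') = (-1)^{i+j} cc'$$

so

$$c_0 c_0' \in C_0 \quad\text{if } c_0 \text{ and } c_0' \in C_0,$$
$$c_0 c_1 \in C_1 \quad\text{if } c_0 \in C_0 \text{ and } c_1 \in C_1,$$
$$c_1 c_0 \in C_1 \quad\text{if } c_1 \in C_1 \text{ and } c_0 \in C_0,$$
$$c_1 c_1' \in C_0 \quad\text{if } c_1 \in C_1 \text{ and } c_1' \in C_1.$$

We can summarize this fact by writing

$$C_0 C_0 \subset C_0, \quad C_0 C_1 \subset C_1, \quad C_1 C_0 \subset C_1, \quad C_1 C_1 \subset C_0.$$

We say that $C$ is a ($\mathbb{Z}_2$-)graded algebra. Now $V \subset C_1$ and $1 \in C_0$ and every element of $C$ can be written (perhaps in several ways) as sums of products of ($1$ and) elements of $V$. Thus $C_0$ consists of sums of products of elements of $V$ having an even number of factors, while $C_1$ consists of sums of products with an odd number of factors.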


Twisted tensor products




TJil 


lutthnnl 







from the Clifford algebraic properties. Now set 


= 7 = If- 


Then 


Thus $C(-2)$ is spanned by $1$, $i$, $j$, and $k$ and the above relations are precisely the defining properties of the quaternions. This algebra is frequently denoted by $\mathbb{H}$.
$A \otimes B$ with the multiplication law

$$(a \otimes b) \cdot (a' \otimes b') = aa' \otimes bb'.$$

(Notice, in contrast to the twisted tensor product, there is no sign change.)


We wish to prove the following basic formulas 


$$C(p, q) \otimes C(1, 1) \cong C(p + 1, q + 1),$$

where $\cong$ means isomorphism.

Let us prove the first of these isomorphisms. Let $V$ be a $(p + q)$-dimensional vector space (say $\mathbb{R}^{p+q}$) with a quadratic form of type $(p, q)$ and let $W = \mathbb{R}^2$ with its positive-definite inner product. Let $\gamma$ denote the element $\gamma = e_1 e_2$ of $C(W) = C(2)$. Consider the map

$$\psi: V \oplus W \to C(p, q) \otimes C(2)$$

given by

$$\psi(v) = v \otimes \gamma, \quad v \in V,$$


Now 





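$$C(p, q) \otimes C(1, 1) \cong C(p + 1, q + 1).$$

Now

$$C(1) \cong \mathbb{R} \oplus \mathbb{R}$$

under the isomorphism sending

$$x\mathbf{1} + ye \mapsto (x - y, x + y)$$

as can be easily checked (exercise). Thus

$$C(-3) \cong C(1) \otimes C(-2) \cong (\mathbb{R} \oplus \mathbb{R}) \otimes \mathbb{H} \cong \mathbb{H} \oplus \mathbb{H}.$$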





It is enough to check this for

$$b = w_1 \cdots w_i, \qquad a' = v_1 \cdots v_j.$$

Now for $v \in V$ and $w \in W$ we have $(v, w) = 0$ in $V \oplus W$. (We write $v$ instead of $f(v) = v \oplus 0$ and similarly for $w$.) So in the Clifford algebra $C(V \oplus W, Q_V \oplus Q_W)$ we have

$$vw + wv = 0$$

and hence

$$ba' = (-1)^{ij} a'b.$$

This is just what is required to prove that $\phi$ is a homomorphism. To see that $\phi$ is an isomorphism, consider the map $h: V \oplus W \to C(V, Q_V) \mathbin{\hat{\otimes}} C(W, Q_W)$ given by

$$h(v \oplus w) = 1_V \mathbin{\hat{\otimes}} w + v \mathbin{\hat{\otimes}} 1_W$$

where $1_V$ is the $1$ of $C(V, Q_V)$ and $1_W$ is the $1$ of $C(W, Q_W)$. It follows immediately from the definition of $\hat{\otimes}$ that

$$h(v \oplus 0)h(0 \oplus w) + h(0 \oplus w)h(v \oplus 0) = 0$$

while $h(v \oplus 0)^2 = Q_V(v)1$ and $h(0 \oplus w)^2 = Q_W(w)1$. Thus by the universal properties of $C(V \oplus W, Q_{V \oplus W})$ the map $h$ induces a homomorphism

$$\phi_h: C(V \oplus W, Q_{V \oplus W}) \to C(V, Q_V) \mathbin{\hat{\otimes}} C(W, Q_W)$$

and

$$\begin{aligned} \phi \circ \phi_h(v \oplus w) &= \phi(v \mathbin{\hat{\otimes}} 1_W + 1_V \mathbin{\hat{\otimes}} w) \\ &= \phi(v \mathbin{\hat{\otimes}} 1_W) + \phi(1_V \mathbin{\hat{\otimes}} w) \\ &= v + w. \end{aligned}$$

Thus $\phi \circ \phi_h = \mathrm{id}$ on $V \oplus W$, hence, by uniqueness,

$$\phi \circ \phi_h = \mathrm{id}.$$

Thus $\phi_h$ is a right inverse to $\phi$. A similar argument shows that $\phi_h \circ \phi = \mathrm{id}$. Thus $\phi$ is an isomorphism.


algebra is just $\Lambda(U)$ (as follows from the definitions). Now let $V$ be any vector space with quadratic form $Q$; we can write

$$V = V^\perp \oplus W$$

where the restriction, $Q_W$, of $Q$ to the subspace $W$ is non-degenerate. Then $\phi$ establishes an isomorphism of $C(V, Q)$ with $\Lambda(V^\perp) \mathbin{\hat{\otimes}} C(W, Q_W)$. So, up to throwing in a twisted tensor product by an exterior algebra, we may restrict attention to quadratic forms which are non-degenerate. Let us introduce the notation

$$C(p, q) = C(\mathbb{R}^n, Q_{p,q})$$

where $Q_{p,q}$ denotes the standard quadratic form on $\mathbb{R}^n$ ($p + q = n$) with $p$ $+$'s and $q$ $-$'s. We wish to understand the structure of $C(p, q)$.

The split case 

For example, in the case $p = q$, as a possible model for $C(p, p)$ we could take $V = U + U^*$ where $U$ is a $p$-dimensional vector space and we take $(u, u') = 0$, $(u^*, u'^*) = 0$ for $u, u' \in U$ and $u^*, u'^* \in U^*$ and

$$(u^*, u) = \langle u^*, u \rangle$$

(the evaluation map). As discussed in the text we have a description of the corresponding Clifford algebra in terms of creation and annihilation operators on $\Lambda(U)$.




Thus 


$$\gamma e_i = (-1)^{n-1} e_i \gamma.$$

Since the $e_i$ form a basis of $V$, it follows that

$$\gamma v = (-1)^{n-1} v\gamma \quad\text{for all } v \in V.$$

Hence

$$\gamma a = \omega^{n-1}(a)\,\gamma \quad\text{for any } a \in C.$$


Low-dimensional examples 

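Here are two important low-dimensional examples. Take $V = \mathbb{R}^2$. We claim that

$$C(2) \cong M(2) = \text{the algebra of all } 2\times 2 \text{ matrices},$$

where $\cong$ means ‘is isomorphic to’. To prove this, let

$$\phi : \mathbb{R}^2 \to M(2)$$

be given by

$$\phi\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x & y \\ y & -x \end{pmatrix}$$

so

$$\phi(u)^2 = (x^2 + y^2)\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad\text{for } u = \begin{pmatrix} x \\ y \end{pmatrix},$$

and $M(2)$ thus has all the desired properties of the Clifford algebra $C(2)$. If we
take the orthogonal basis $\binom{1}{0}$, $\binom{0}{1}$ then the corresponding elements of the
Clifford algebra $M(2)$ are

$$e_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad\text{and}\quad e_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

so

$$\gamma = e_1 e_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

Notice that $\gamma^2 = -1$ as required. The even piece, $C_0$, of $C$ consists of all $2\times 2$
matrices which commute with $\gamma$. This is just the set of conformal matrices. The
subspace $C_1$, the odd component of $C$, consists, in this special case, of the subspace
$\mathbb{R}^2$ itself, identified with the space of symmetric matrices of trace zero.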
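The claims in this example can be checked mechanically. The following is a minimal sketch in plain Python; the helper names (`matmul`, `add`, `scale`, `phi`, `commutes_with_gamma`) are ours and purely illustrative, and the map `phi` is the one sending $(x, y)$ to the symmetric trace-zero matrix with rows $(x, y)$ and $(y, -x)$.

```python
# 2x2 real matrices as nested tuples; plain Python, no external libraries.

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def scale(s, a):
    """Scalar multiple of a 2x2 matrix."""
    return tuple(tuple(s * a[i][j] for j in range(2)) for i in range(2))

IDENTITY = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))

def phi(x, y):
    """Image in M(2) of the vector (x, y) of R^2 (symmetric, trace zero)."""
    return ((x, y), (y, -x))

e1 = phi(1, 0)
e2 = phi(0, 1)
gamma = matmul(e1, e2)

def commutes_with_gamma(m):
    """True when m lies in the even piece, i.e. commutes with gamma."""
    return matmul(m, gamma) == matmul(gamma, m)

print("e1 e2 + e2 e1 =", add(matmul(e1, e2), matmul(e2, e1)))
print("gamma^2 =", matmul(gamma, gamma))
```

A matrix of the form $aI + b\gamma$ (a conformal matrix) passes `commutes_with_gamma`, while `e1` and `e2` do not, which matches the description of the even and odd pieces.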

We have already investigated the case of $C(-2)$ and found that it is isomorphic
to the quaternions. We review the construction. We define $i$ to be the element
$\binom{1}{0}$ in $C(-2)$ and $j$ to be the element $\binom{0}{1}$. Then

$$i^2 = -1, \qquad j^2 = -1, \qquad ij + ji = 0.$$





0 


1 


2 


3 


4 ... 


0 

1 

u 

U + R 

€ 

R(2) 

H H©H 

2 

m D 


U(4) 

3 

_ A _ 

C(2) 




4 


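For example, to find $C(1,2)$ we tensor the entry, $\mathbb{C}$, for $C(0,1)$ by $\mathbb{R}(2) \cong C(1,1)$ to
find that $C(1,2) \cong \mathbb{C}(2)$. Similarly, to find $C(3,1)$ we tensor the entry $\mathbb{R}(2) \cong C(2,0)$
by $\mathbb{R}(2)$ to find that $C(3,1) \cong \mathbb{R}(2)\otimes\mathbb{R}(2)$. But, by the definition of matrix algebras,
$\mathbb{R}(k)\otimes\mathbb{R}(l) \cong \mathbb{R}(kl)$. So $\mathbb{R}(2)\otimes\mathbb{R}(2) \cong \mathbb{R}(4)$. Similarly,

$$C(1,3) \cong \mathbb{H}\otimes\mathbb{R}(2) = \mathbb{H}(2),$$

the algebra of all $2\times 2$ matrices with quaternionic entries. (Notice that $C(1,3)$ and
$C(3,1)$ are not isomorphic.) Similarly

$$C(4,0) \cong C(0,2)\otimes C(2,0) = \mathbb{H}\otimes\mathbb{R}(2) \cong \mathbb{H}(2)$$

and

$$C(0,4) \cong C(2,0)\otimes C(0,2) = \mathbb{H}(2).$$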

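Thus by working back and forth you will find

$$\begin{array}{c|ccccccccc}
p \backslash q & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline
0 & \mathbb{R} & \mathbb{C} & \mathbb{H} & \mathbb{H}\oplus\mathbb{H} & \mathbb{H}(2) & \mathbb{C}(4) & \mathbb{R}(8) & \mathbb{R}(8)\oplus\mathbb{R}(8) & \mathbb{R}(16) \\
1 & \mathbb{R}\oplus\mathbb{R} \\
2 & \mathbb{R}(2) \\
3 & \mathbb{C}(2) \\
4 & \mathbb{H}(2) \\
5 & \mathbb{H}(2)\oplus\mathbb{H}(2) \\
6 & \mathbb{H}(4) \\
7 & \mathbb{C}(8) \\
8 & \mathbb{R}(16)
\end{array}$$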


The rest of the table can be filled in by moving along the diagonal, tensoring by
$\mathbb{R}(2)$ for each step. This eight by eight block then determines all the Clifford
algebras, as $C(n + 8, 0) \cong C(n, 0)\otimes\mathbb{R}(16)$ and $C(0, n + 8) \cong C(0, n)\otimes\mathbb{R}(16)$, as follows
from a fourfold application of our basic isomorphisms. For example:

To compute $C(9)$ we can start from $C(1)$ and successively tensor by $C(-2)$,
$C(2)$, $C(-2)$, $C(2)$. But this is the same as tensoring by

$$C(2)\otimes C(-2)\otimes C(2)\otimes C(-2)$$

$$= C(-4)\otimes C(2)\otimes C(-2)$$

$$= C(6)\otimes C(-2)$$

$$= C(8) \cong C(-8) = \mathbb{R}(16).$$

This same argument shows that

$$C(k + 8) \cong C(k)\otimes\mathbb{R}(16),$$

$$C(-k - 8) \cong C(-k)\otimes\mathbb{R}(16).$$

Thus

$$C(8r + k) \cong C(k)\otimes\mathbb{R}(16^r) \quad\text{(Bott periodicity)},$$

$$C(-8r - k) \cong C(-k)\otimes\mathbb{R}(16^r).$$

This completely determines the structure of all finite-dimensional Clifford
algebras.
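The bookkeeping behind the table and the periodicity statement can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the text's own procedure: an algebra is encoded by a hypothetical triple `(field, n, copies)`, the recursion uses the isomorphisms $C(n+2,0) \cong C(0,n)\otimes\mathbb{R}(2)$ and $C(0,n+2) \cong C(n,0)\otimes\mathbb{H}$, and the standard facts $\mathbb{C}\otimes\mathbb{H} \cong \mathbb{C}(2)$ and $\mathbb{H}\otimes\mathbb{H} \cong \mathbb{R}(4)$ are used to tensor with $\mathbb{H}$.

```python
from functools import lru_cache

# An algebra is encoded as (field, n, copies): a direct sum of `copies`
# copies of the n x n matrices over `field`, where field is "R", "C" or "H".
REAL_DIM = {"R": 1, "C": 2, "H": 4}

def tensor_R2(alg):
    """Tensor with R(2): the matrix size doubles."""
    field, n, copies = alg
    return (field, 2 * n, copies)

def tensor_H(alg):
    """Tensor with the quaternions H (all tensor products are over R)."""
    field, n, copies = alg
    if field == "R":
        return ("H", n, copies)        # R (x) H = H
    if field == "C":
        return ("C", 2 * n, copies)    # C (x) H = C(2)
    return ("R", 4 * n, copies)        # H (x) H = R(4)

@lru_cache(maxsize=None)
def clifford(k):
    """C(k): for k >= 0 this is C(k, 0); for k < 0 it is C(0, -k)."""
    if k == 0:
        return ("R", 1, 1)
    if k == 1:
        return ("R", 1, 2)             # R + R
    if k == -1:
        return ("C", 1, 1)
    if k > 0:
        return tensor_R2(clifford(-(k - 2)))   # C(k,0) = C(0,k-2) (x) R(2)
    return tensor_H(clifford(-k - 2))          # C(0,n) = C(n-2,0) (x) H

def real_dimension(alg):
    field, n, copies = alg
    return copies * REAL_DIM[field] * n * n

def name(alg):
    field, n, copies = alg
    single = field if n == 1 else f"{field}({n})"
    return " + ".join([single] * copies)

for k in range(-9, 10):
    print(k, name(clifford(k)))
```

The real dimension of $C(k)$ should always be $2^{|k|}$, which gives a quick consistency check on the whole table.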
Chapter 19 can be thought of as the culmination of the course.
It applies the results of the preceding chapters to the study of 
Maxwell’s equations and the associated wave equations. 


19.1. The equations 


Electrostatics and magnetostatics are only approximately correct. For time-varying
fields they must be replaced by Maxwell’s theory. We begin with Faraday’s law
of induction which says that for any surface $S$ spanning a curve $\gamma$ the time derivative
of the flux of $B$ through $S$ is related to $E$ by

$$\frac{d}{dt}\int_S B = -\int_\gamma E.$$

We now rewrite this in a form suitable to four dimensions. Consider an interval
$[a, b]$ in time and the three-dimensional cylinder $S\times[a, b]$ whose boundary is the






with respect to t from a to b gives the equation 



$$\int_{S\times\{b\}} B - \int_{S\times\{a\}} B + \int_{\gamma\times[a,b]} E\wedge dt = 0.$$






Let us set 


$$F = B + E\wedge dt$$

so that $F$ is a two-form defined on four-dimensional space. Let $C$ denote the three-dimensional
cylinder, $C = S\times[a, b]$, so that

$$\partial C = S\times\{b\} - S\times\{a\} + \gamma\times[a, b].$$

Now, $B$ is a two-form involving just the spatial differentials and, therefore, must
vanish when restricted to the side $\gamma\times[a, b]$ of the cylinder, while $dt$ and hence
$E\wedge dt$ must vanish on the top and the bottom. Thus, we can write Faraday’s law of
induction as

$$\int_{\partial C} F = 0.$$

We can also consider a three-dimensional region $C$ lying entirely in space at one
fixed constant time. In this case, $\partial C$ will be a surface on which $dt = 0$ so that

$$\int_{\partial C} F = \int_{\partial C} B$$

and, by the absence of true magnetism, the surface integral must vanish. Thus,
$\int_{\partial C} F = 0$ for all three-dimensional cubes whose sides are parallel to any three of the
four coordinate axes. This is enough to imply that $dF = 0$, where now, of course, $d$
stands for the exterior derivative in four-space. Since

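$$F = B_x\,dy\wedge dz - B_y\,dx\wedge dz + B_z\,dx\wedge dy + E_x\,dx\wedge dt + E_y\,dy\wedge dt + E_z\,dz\wedge dt,$$

the equation $dF = 0$ is equivalent to the four equations

$$\frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} = 0, \qquad -\frac{\partial B_y}{\partial t} + \frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z} = 0,$$

$$\frac{\partial B_x}{\partial t} - \frac{\partial E_y}{\partial z} + \frac{\partial E_z}{\partial y} = 0, \qquad \frac{\partial B_z}{\partial t} - \frac{\partial E_x}{\partial y} + \frac{\partial E_y}{\partial x} = 0.$$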
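To see the four component equations of $dF = 0$ in action, here is a small numerical sketch in Python. The plane-wave field used (`E` with only a $y$-component $\cos(x - t)$, `B` with only a $z$-component $\cos(x - t)$, in units where the equations carry no factor of $c$) is our own test case, not one from the text, and the derivatives are approximated by central differences.

```python
import math

# Test field (an assumption for illustration): a plane wave moving in the x-direction.
def E(t, x, y, z):
    return (0.0, math.cos(x - t), 0.0)      # (E_x, E_y, E_z)

def B(t, x, y, z):
    return (0.0, 0.0, math.cos(x - t))      # (B_x, B_y, B_z)

T, X, Y, Z = 0, 1, 2, 3                     # positions of t, x, y, z in a point

def partial(field, comp, var, point, h=1e-5):
    """Central-difference approximation of d(field[comp]) / d(var) at point."""
    hi = list(point)
    lo = list(point)
    hi[var] += h
    lo[var] -= h
    return (field(*hi)[comp] - field(*lo)[comp]) / (2 * h)

def residuals(point):
    """Left-hand sides of the four component equations of dF = 0."""
    d = partial
    return (
        d(B, 0, X, point) + d(B, 1, Y, point) + d(B, 2, Z, point),
        d(B, 0, T, point) - d(E, 1, Z, point) + d(E, 2, Y, point),
        -d(B, 1, T, point) + d(E, 2, X, point) - d(E, 0, Z, point),
        d(B, 2, T, point) - d(E, 0, Y, point) + d(E, 1, X, point),
    )

print(residuals((0.3, 1.1, -0.7, 2.0)))     # all four should be close to zero
```

Replacing `B` by a field that is not matched to `E` (for instance the zero field) makes the last residual visibly non-zero, which is a useful way to convince oneself that the signs are right.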

We will use Faraday’s law to define $B$ so that the units of $B$ are

$$\frac{\text{voltage}\cdot\text{time}}{\text{area}} = \frac{\text{energy}\cdot\text{time}}{\text{charge}\cdot(\text{length})^2}.$$


Ampere’s law relates current to magnetism. It says that the electric current flux
through a surface $S$ whose boundary is $\gamma$ equals the magnetic loop tension around $\gamma$.
According to Maxwell’s great discovery, we must write the electric current flux as the
sum of two terms $\partial D/\partial t + 4\pi J$, where $D$ is the dielectric displacement and $J$ is the
current density of moving charges. (For slowly varying fields, the first term is
negligible in comparison to the second and did not appear in Ampere’s original
formulation.) The magnetic loop tension is obtained by integrating a linear
differential form $H$, called the magnetic field strength, around $\gamma$. Thus Maxwell’s
modification of Ampere’s law says

$$\int_S\left(\frac{\partial D}{\partial t} + 4\pi J\right) = \int_\gamma H.$$

Consider the three-dimensional cylinder C = S x [a, b] as before and set 

G = D - H a dt. 

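We integrate Ampere’s law from $a$ to $b$ with respect to $t$ and get

$$\int_{\partial C} G = -4\pi\int_C J\wedge dt.$$

Gauss’s law says that the integral of $D$ over the boundary of a three-dimensional region $R$ in space equals $4\pi\times$(the
total charge in $R$) which is $4\pi\int_R \rho\,dx\wedge dy\wedge dz$. Thus, for regions which lie in constant
time, we have

$$\int_{\partial R} G = 4\pi\int_R \rho\,dx\wedge dy\wedge dz.$$

Let us set

$$j = \rho\,dx\wedge dy\wedge dz - J\wedge dt$$

and we see that

$$\int_{\partial C} G = 4\pi\int_C j,$$

so that

$$dG = 4\pi j,$$

from which it follows that $dj = 0$.
We summarize: if we set

$$F = B + E\wedge dt, \qquad G = D - H\wedge dt, \qquad j = \rho\,dx\wedge dy\wedge dz - J\wedge dt,$$

then Maxwell’s equations say

$$dF = 0, \qquad dG = 4\pi j.$$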

Note that Maxwell’s equations are invariant under smooth changes of 
coordinates. 

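We will use Ampere’s law to define $H$. $D$ has units charge/area so $\partial D/\partial t$ has units
charge/(area·time) and $J$ has units current/area = charge/(area·time). Thus $H$ has
units charge/(time·length).

The constitutive relations in vacuum are

$$D = \varepsilon_0\star E \quad\text{and}\quad B = \mu_0\star H.$$

$\varepsilon_0$ has units:

$$\frac{\text{charge}}{\text{area}}\times\frac{\text{length}}{\text{voltage}} = \frac{(\text{charge})^2}{\text{energy}\cdot\text{length}}$$

while $\mu_0$ has units

$$\frac{\text{energy}\cdot\text{time}}{\text{charge}\cdot(\text{length})^2}\times\frac{\text{time}\cdot\text{length}}{\text{charge}} = \frac{\text{energy}\cdot(\text{time})^2}{(\text{charge})^2\cdot\text{length}}.$$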

Thus $1/\varepsilon_0\mu_0$ has units (length)²/(time)² = (velocity)². Thus the theory of electromagnetism
has a fundamental velocity built into it. It was Maxwell’s great discovery that
this velocity is exactly $c$ - the velocity of light! So introduce $c\,dt$ instead of $dt$ and the
four-dimensional $\star$ operator becomes

$$\star(dx\wedge dy) = -c\,dt\wedge dz,$$

$$\star(dx\wedge dz) = c\,dt\wedge dy,$$

$$\star(dy\wedge dz) = -c\,dt\wedge dx,$$

$$\star(dx\wedge c\,dt) = -dy\wedge dz,$$

$$\star(dy\wedge c\,dt) = dx\wedge dz,$$

$$\star(dz\wedge c\,dt) = -dx\wedge dy.$$

Then

$$\star F = c\,(B_x\,dx\wedge dt + B_y\,dy\wedge dt + B_z\,dz\wedge dt) - (1/c)(E_x\,dy\wedge dz - E_y\,dx\wedge dz + E_z\,dx\wedge dy).$$

The constitutive relations can now be written as

$$G = -\sqrt{\varepsilon_0/\mu_0}\;\star F.$$

Since $\sqrt{\varepsilon_0/\mu_0}$ is a constant, we can absorb all constants into a
redefinition of $j$ and write Maxwell’s equations as

$$dF = 0 \quad\text{and}\quad d\star F = 4\pi j.$$
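The action of the four-dimensional star operator on two-forms is easy to tabulate and test. The sketch below, in Python with $c = 1$, encodes a two-form as a dictionary of coefficients on the basis $dx\wedge dy$, $dx\wedge dz$, $dy\wedge dz$, $dx\wedge dt$, $dy\wedge dt$, $dz\wedge dt$ (keys `"xy"`, `"xz"`, and so on). The encoding and the names `STAR`, `star` and `field_two_form` are our own illustrative choices; the signs are the ones listed above, rewritten in that basis.

```python
# Star on basis two-forms with c = 1, e.g. star(dx^dy) = -dt^dz = +dz^dt.
STAR = {
    "xy": ("zt", 1),
    "xz": ("yt", -1),
    "yz": ("xt", 1),
    "xt": ("yz", -1),
    "yt": ("xz", 1),
    "zt": ("xy", -1),
}

def star(form):
    """Apply the star operator to a two-form given as {basis key: coefficient}."""
    out = {key: 0 for key in STAR}
    for key, coeff in form.items():
        target, sign = STAR[key]
        out[target] += sign * coeff
    return out

def field_two_form(E, B):
    """F = B + E ^ dt for E = (Ex, Ey, Ez) and B = (Bx, By, Bz)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return {"yz": Bx, "xz": -By, "xy": Bz, "xt": Ex, "yt": Ey, "zt": Ez}

F = field_two_form(E=(1, 2, 3), B=(4, 5, 6))
print(star(F))          # B components move to the dt terms, E components to the spatial terms
print(star(star(F)))    # star applied twice is minus the identity on two-forms
```

The second printout illustrates that $\star\star = -1$ on two-forms in this signature, which is why $\star F$ cannot simply be identified with $F$.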

From the last chapter, we know a procedure for dealing with this system of
equations: Assuming the appropriate topological condition, we look for a one-form
$A$ with

$$dA = F, \qquad d\star A = 0$$

and

$$-\Box A = 4\pi\star j.$$

Thus the solutions of Maxwell’s equations are closely related to the study of the wave
operator

$$-\Box = \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2}$$

if we use coordinates where $c = 1$.


19.2. The 




In this section, as a warm-up to the study of Maxwell’s equations, and of interest in
its own right, let us study the equation

$$\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0.$$

If we make the change of variables $p = x + t$, $q = x - t$, this equation becomes

$$\frac{\partial^2 u}{\partial p\,\partial q} = 0$$

so, by integration,

$$u = u_1(p) + u_2(q)$$

or

$$u(x, t) = u_1(x + t) + u_2(x - t). \tag{19.1}$$

Any such function is clearly a solution. The function $u_2(x - t)$ can be thought of
dynamically. At each instant of time $t$, the graph of $u_2(x - t)$ is just given by the graph
of $u_2(x)$ displaced $t$ units to the right. We say that $u_2(x - t)$ represents a wave moving
without distortion to the right. Thus, the most general solution of the homogeneous
wave equation in two dimensions is the sum of two undistorted waves, one moving
to the left and the other moving to the right.
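The statement that any sum $u_1(x + t) + u_2(x - t)$ solves the equation can be checked numerically. Below is a minimal Python sketch; the two profiles (a Gaussian moving left and a sine wave moving right) are arbitrary choices of ours, and second derivatives are approximated by the usual three-point difference.

```python
import math

def u1(s):
    return math.exp(-s * s)        # profile of the wave moving to the left

def u2(s):
    return math.sin(s)             # profile of the wave moving to the right

def u(x, t):
    return u1(x + t) + u2(x - t)

def second_difference(f, s, h=1e-3):
    """Three-point approximation of f''(s)."""
    return (f(s + h) - 2 * f(s) + f(s - h)) / (h * h)

def wave_residual(x, t, h=1e-3):
    """Approximation of u_tt - u_xx at (x, t); should be close to zero."""
    u_tt = second_difference(lambda tau: u(x, tau), t, h)
    u_xx = second_difference(lambda xi: u(xi, t), x, h)
    return u_tt - u_xx

print(wave_residual(0.4, 1.3))
```

Replacing `u` by something that is not of this form, for instance `math.sin(x) * t`, gives a residual that is clearly not small.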

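We are usually interested in describing the wave motion corresponding to some
initial conditions

$$u(x, 0) = u_0(x), \qquad \frac{\partial u}{\partial t}(x, 0) = v_0(x).$$

By (19.1) these conditions say that $u_1 + u_2 = u_0$ and $u_1' - u_2' = v_0$, so that

$$u_1(x) - u_2(x) = \int_0^x v_0(s)\,ds + c,$$

where $c$ is some constant. So adding and subtracting the equation $u_0(x) = u_1(x) + u_2(x)$ gives

$$2u_1(x) = u_0(x) + \int_0^x v_0(s)\,ds + c,$$

$$2u_2(x) = u_0(x) - \int_0^x v_0(s)\,ds - c.$$

So, from (19.1), we see that

$$u(x, t) = \tfrac12\left[u_0(x + t) + u_0(x - t)\right] + \tfrac12\int_{x-t}^{x+t} v_0(s)\,ds.$$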

This is just another way of writing our formula. 
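For initial data $u(x, 0) = u_0(x)$ and $\partial u/\partial t(x, 0) = v_0(x)$, the solution of the one-dimensional wave equation is $u(x, t) = \tfrac12[u_0(x + t) + u_0(x - t)] + \tfrac12\int_{x-t}^{x+t} v_0(s)\,ds$. The following Python sketch evaluates this formula; the quadrature routine (composite Simpson's rule) and the function names are our own choices, and the test data $u_0 = \sin$, $v_0 = \cos$ are picked because the exact solution is then $\sin(x + t)$.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def dalembert(u0, v0, x, t):
    """Value at (x, t) of the solution with u(., 0) = u0 and u_t(., 0) = v0."""
    return 0.5 * (u0(x + t) + u0(x - t)) + 0.5 * integrate(v0, x - t, x + t)

x, t = 0.3, 1.2
print(dalembert(math.sin, math.cos, x, t), math.sin(x + t))
```

Note that `dalembert` only ever evaluates the data on the interval from `x - t` to `x + t`, which is the domain of dependence of the point `(x, t)`.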





value problem. We also see from the explicit formula that small changes in the initial
values cause small changes in the solution. We say that the problem is well-posed.
Notice that the value of $u$ at $(x_0, t_0)$ does not depend on all the values of $u$ and $\partial u/\partial t$ at
$t = 0$, but only on the values in the interval $[x_0 - t_0, x_0 + t_0]$. Put another way, the
values at $(x, 0)$ only influence those spacetime points which lie in the forward cone

Figure 19.2. Domain of dependence






19.3. The homogeneous wave equation in $\mathbb{R}^3$


We wish to solve the initial-value problem

$$\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} - \frac{\partial^2 u}{\partial z^2} = 0,$$

$$u(x, 0) = u_0(x),$$

$$\frac{\partial u}{\partial t}(x, 0) = v_0(x).$$

Let $A(t, x)$ denote the process of averaging over a sphere of radius $|t|$ centered at $x$.
Thus

$$A(t, x)f = \int_S f(x + t\,\cdot)\,\tau.$$

Here $f(x + t\,\cdot)$ denotes the function on the unit sphere $S$ whose value at any vector $v$
with $\|v\| = 1$ is given by $f(x + tv)$ and $\tau$ is $1/4\pi$ times the solid angle form. We
claim that the function $u$ given by

$$u(x, t) = tA(t, x)v_0 + \frac{\partial}{\partial t}\left[tA(t, x)u_0\right]$$

is the unique solution to our problem.
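Before the general verification, it may help to see the formula at work in one explicit case. The choice of data below ($u_0 = 0$, $v_0(x) = \|x\|^2$) is ours, made only because the spherical average can be computed by hand: the cross term $x\cdot v$ is odd in $v$ and averages to zero, and $\int_S \tau = 1$.

```latex
\begin{align*}
A(t,x)v_0 &= \int_S \|x + tv\|^2\,\tau
           = \|x\|^2 + 2t\int_S (x\cdot v)\,\tau + t^2
           = \|x\|^2 + t^2,\\
u(x,t) &= t\,A(t,x)v_0 = t\,\|x\|^2 + t^3,\\
\frac{\partial^2 u}{\partial t^2} &= 6t,
\qquad
\Delta u = t\,\Delta\|x\|^2 = 6t,\\
u(x,0) &= 0,
\qquad
\frac{\partial u}{\partial t}(x,0) = \|x\|^2 = v_0(x).
\end{align*}
```

So in this case the claimed $u$ satisfies both the wave equation and the initial conditions, as it should.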

We first show that it is a solution and begin by showing that it satisfies the initial
conditions. For small values of $t$ we can expect

$$u(x, t) = A(t, x)u_0 + O(t)$$

and the limit of $A(t, x)u_0$ is clearly $u_0(x)$ as we are averaging over smaller and smaller
spheres centered at $x$. Thus

$$u(x, 0) = u_0(x).$$

Also

$$\frac{\partial u}{\partial t}(x, 0) = \lim_{t\to 0}\left[A(t, x)v_0 + 2\frac{\partial A}{\partial t}(t, x)u_0\right]$$

since all remaining terms are $O(t)$. But

$$\frac{\partial A}{\partial t}(0, x)u_0 = \int_S \sum_i \frac{\partial u_0}{\partial x_i}(x)\,x_i\,\tau = \sum_i \frac{\partial u_0}{\partial x_i}(x)\int_S x_i\,\tau = 0$$

since the average of an odd function - like the coordinate $x_i$ - over the unit sphere
must vanish.

To prove that $u$ satisfies the wave equation, it is clearly enough to prove that, for
any function $\omega$ of $x$, the function

$$W(x, t) = tA(t, x)\omega$$

satisfies the wave equation. This is because our expression for $u$ is a sum of two
terms - one of the above form and the other of the form $\partial W/\partial t$. And if $W$ is a solution
of the wave equation, so is $\partial W/\partial t$.

For any twice-differentiable function $\omega$, we have

where $S_{x,t}$ denotes the sphere of radius $|t|$ centered at $x$.
Now on the surface of a sphere of radius $r$ we can write







We can turn this argument around and prove the uniqueness of our solution by
reducing the three-dimensional homogeneous wave equation to the one-dimensional
equation. We have proved the integral formula

$$\int_{\|y - x\|\le r} \Delta u(y)\,dy = r^2\int_{\|v\| = 1} \frac{\partial}{\partial r}u(x + vr)\,d\omega.$$

So, for a function $u(x, t)$, we have

$$\int_{\|y - x\|\le r} \Delta u(y, t)\,dy = 4\pi r^2\frac{\partial}{\partial r}A(r, x)u.$$

If $u$ satisfies the equation

$$\frac{\partial^2 u}{\partial t^2} = \Delta u,$$

then we can substitute into the above equation to obtain

$$4\pi r^2\frac{\partial}{\partial r}A(r, x)u = \int_{\|y - x\|\le r}\frac{\partial^2 u}{\partial t^2}(y, t)\,dy = \int_0^r d\rho\int_{\|y - x\| = \rho}\frac{\partial^2 u}{\partial t^2}(y, t)\,dS.$$

Differentiation with respect to $r$ now shows that the function

$$Z(r, t) = rA(r, x)u(\cdot, t)$$

satisfies the one-dimensional wave equation

$$\frac{\partial^2 Z}{\partial r^2} - \frac{\partial^2 Z}{\partial t^2} = 0.$$

Thus

$$Z(r, t) = w_1(r + t) + w_2(r - t).$$

Since $Z(0, t) = 0$, we see that

$$w_1(t) = -w_2(-t).$$

So writing $w$ for $w_1$,

$$Z(r, t) = w(r + t) - w(t - r).$$

Differentiating this with respect to $r$ and letting $r \to 0$, we get

$$u(x, t) = A(0, x)u(\cdot, t) = 2w'(t).$$

Also

$$\frac{\partial Z}{\partial r} + \frac{\partial Z}{\partial t} = 2w'(r + t)$$

so

$$u(x, r) = 2w'(r) = \lim_{t\to 0} 2w'(r + t) = \lim_{t\to 0}\frac{\partial Z}{\partial r} + \lim_{t\to 0}\frac{\partial Z}{\partial t}.$$

Substituting the expression for $Z$ into this last equation gives

$$u(x, r) = \frac{\partial}{\partial r}\left(\frac{r}{4\pi}\int_{\|y\| = 1} u(x + yr, 0)\,dS\right) + \frac{r}{4\pi}\int_{\|y\| = 1}\frac{\partial u}{\partial t}(x + yr, 0)\,dS.$$


Let us first show how to solve this equation with the initial conditions 



We can solve the inhomogeneous wave equation by reducing it to solving the
homogeneous wave equation with a series of initial conditions as follows. Let
$v(x, t, \tau)$ be the solution to the initial value problem



$$w(x, t) = \int_0^t v(x, t, \tau)\,d\tau$$


Then it is obvious that w(x, 0) = 0 and 



and therefore 






while 




Now the function v is given by 



This is the formula of retarded potentials:

$$u(x, t) = \frac{1}{4\pi}\int\frac{f(y, t - \|y - x\|)}{\|y - x\|}\,dy.$$

It looks like Poisson’s formula for the potential, but the contribution from the
sphere of radius $a$ about $x$ is given by $1/a$ times the value of $f$ at a time $a$ units earlier.

Once we have a solution for the inhomogeneous equation with zero initial
conditions, we can then reduce the inhomogeneous equation with arbitrary initial
conditions, by subtracting our given solution, to solving the homogeneous
equation.


uano 


o eacn o 


e componenis oi 


our-potemiai a in 


equations. We obtain 


A(x) = - 


X — X 


current density j. The general solution to Maxwell’s equations is obtained by 





19.5. The electromagnetic Lagrangian and the 


Let $A$ be a one-form and $j$ a three-form on Minkowski space, $\mathbb{R}^{1,3}$. Consider the
four-form

$$\mathscr{L} = \mathscr{L}(A, j) = -\tfrac12\,dA\wedge\star dA - 4\pi A\wedge j.$$




ver an 


l.hm.NW.IMlIB.W.'.U.UM.MI 


reeio 


iirmiuBniiintiffn 







iiiniManiii 


of A. We say that A is an extremal of L if 

X"^'n,j(^s)Is=o = 9 


vara ~ ^ 

Write -j— = B\ then 

as s =o 


— (v4 & . ,j) 1 _ o — —1( 


Now 


d (B a *d/l) = dB a *d ,4 + Ba d*d ^4 


d B a *dA = dT a *d£ 


— = B a [d*dT — 4?y] = d(B a *dA). 

as s =o 

Now the boundary integral J 8N B a *dT vanishes since B = 0 on the boundary. Thus 


d L N j(A s ) 


7%- 



ds 

s = 0 » 

B A I 

N 

[d*da — An j\ 



If this vanishes for all B vanishing at the boundary, we must have 


d*dA = An j. 








We have thus derived Maxwell’s equations from a variational principle: Maxwell’. 


There are several advantages to a Lagrangian formulation. One is that, by 
modifying L, we get a procedure for modifying the equations for its extremals 




formulation lends itself to one of the procedures for passing from classical mechanics
to quantum mechanics. The function $L$ plays a key role in quantum
electrodynamics. A third advantage of a Lagrangian formulation is that every
one-parameter group of symmetries of $L$ gives a conservation law by a method known




Let £ be a vector field on R 1,3 with the property that 


As we have seen, any conformal transformation of spacetime preserves the star 




I IVflQ rilMIlKfl 




parameter group of conformal transformations. In particular, £ can be any constant 


Now 


Since 


= — jdD^A a *dA —jdAD 4 *dA — D^A a 4nj — A a 4nD^j. 


Ds *d A = *DgdA = ★d D g A 


ID^A a ★ d A — dA a ★ dD^A 


the first and second terms are equal, so they add up to 


a *dT = — i 


a d*cL4. 


Thus 


But d*dv4 = 471 / by Maxwell’s equations. So 


A 471 j 


using d A = 


= - d i\c)r A *r — dUC A a d*< 


A 471J 


= — d[i(£).F a *F] — di(%)A a 4%j — A a 4nD^j 
[★dA = j. On the other hand, 

D^A = i{£,)dA + d A + di($)A 


while 


= - idZ(5)(F a *F) - m)F + di(^)A) a j — A a D g j 



since d (F a * F) = 0 because jF a *F is a four-form. Setting these two expressions for 



- d[i(%)F a *F] - di(%)A a 4nj — A a D^j 



In the absence of currents, i.e., if j = 0, we would find that the three-form 



would be closed. This is interpreted as a conservation law: the integral of C(|) over 



the same answer. 



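instructive calculation which we leave to you will show that

$$C\left(\frac{\partial}{\partial t}\right) = \tfrac12\left(\|E\|^2 + \|B\|^2\right)dx\wedge dy\wedge dz + P_1\,dy\wedge dz\wedge dt - P_2\,dx\wedge dz\wedge dt + P_3\,dx\wedge dy\wedge dt$$

where

$$P = E\times B.$$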















(The three-vector P is called the Poynting vector.) Thus on a surface t = constant 

. i *4 _ 1 r* /^/ f* \ * • ^ 1 , 1 ~ , • 1 J • 1 * 9 


w ^ -- 

conservation law thus becomes the conservation of energy. The remaining
components of $C(\xi)$ in this case describe the flux of (local energy density) across any
surface.

Just as energy is associated with time translation, the $x$-component of momentum
is associated with infinitesimal translation in the $x$-direction. This is a general
principle of mechanics - see for example Loomis & Sternberg, Chapter 13, or, for a
more sophisticated version, Guillemin & Sternberg, Chapter 2. Thus $C(\partial/\partial x)$ will be


a M I Mi g imyi i iimm is iiuihai ink] *>; r«(<g| i m* 111 11 fsn[4 




gives the total momentum of the electromagnetic field in the x-direction. Over 




interval [a, b]. 
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T(4,tj) = C(4)(v) makes sense for some other constant vector t]. The function 
$T = T(\xi, \eta)$ is called the energy-momentum tensor because coded into it are all
components of energy or momentum density and flux.

In the presence of a source term $j$, the forms $C(\xi)$ are not closed, but satisfy

$$dC(\xi) = i(\xi)F\wedge 4\pi j.$$

This represents an exchange of energy-momentum from the electromagnetic field to
the charge-current. It is the Lorentz force.


19.6. Wave forms and Huyghens’ principle


A solution of the wave equation

$$\Box u = 0$$

of the form

$$u(x, t) = f(x)\,g(h(x) \pm t)$$

is called a progressing wave. The expression $h(x) \pm t$ is called the phase of the wave
and the level surfaces $h$ = constant are called wave fronts.

In one dimension, we have seen that taking $f = 1$, $h(x) = x$ and $g$ arbitrary gives a
solution (and that the general solution can be written as the sum of two such
expressions, one with a $+$ and the other with a $-$).


$$h(x) = k\cdot x$$

and $f = 1$, $g$ again arbitrary. The only condition to solve the equation is that
$\|k\|^2 = 1$. (In fact, we have reduced the problem to a one-dimensional problem: By

we are reduced to the wave equation in one spatial dimension.)

In particular, if we take $g(\theta) = e^{i\theta}$, we get the functions

$$e^{i(k\cdot x - t)}, \qquad \|k\|^2 = 1.$$

The importance of these sinusoidal plane wave solutions is the following: It is a
theorem that every smooth function on $\mathbb{R}^{1,3}$ (or more generally on $\mathbb{R}^n$) which
vanishes sufficiently rapidly at infinity can be written as a superposition of functions
of the form $e^{i\,l\cdot v}$. That is, we can write any such function as

for a suitable function, $f$. We shall give a rapid proof of this fact in Chapter 21. It
turns out that this Fourier inversion formula is true in greater generality, and that (in
our case), the most general solution of the wave equation can be represented as a
superposition of the functions $e^{i(k\cdot x - t)}$, $\|k\|^2 = 1$.

As a second example of a wave form, consider spherically symmetric solutions of
the wave equation (defined outside the line $r = 0$). In terms of polar coordinates, we
have

$$\Delta = \frac{1}{r^2}\frac{\partial}{\partial r}r^2\frac{\partial}{\partial r} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$

so that if $u$ is spherically symmetric, $u = u(r, t)$, the wave equation becomes

$$\frac{\partial^2 u}{\partial t^2} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial u}{\partial r}\right) = \frac{2}{r}\frac{\partial u}{\partial r} + \frac{\partial^2 u}{\partial r^2} = \frac{1}{r}\frac{\partial^2(ru)}{\partial r^2}.$$

Thus $v = ru$ satisfies the one-dimensional wave equation

$$\left(\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial r^2}\right)v = 0.$$

The general solution of this equation is given by

$$v(r, t) = f(r + t) + g(r - t)$$

and so the general solution of the symmetric wave equation is given by

$$u(r, t) = \frac{f(r + t)}{r} + \frac{g(r - t)}{r}.$$

Here the first term represents an incoming wave and the second term represents an
outgoing wave. In particular, if we take $f = 0$ and $g(s) = e^{iks}$ then

$$w_k(t, r) = \frac{e^{ik(r - t)}}{r}$$

represents an outgoing (sinusoidal) wave of frequency $k$. Indeed, up to normalizing
constants, it is easy to check that

$$E_k(r) = \frac{e^{ikr}}{r}$$




t 


c o 


is the ‘fundamental solution’ to the reduced wave operator $\Delta + k^2$, i.e.,

$$(\Delta + k^2)E_k = 0$$

for $r \ne 0$, while

$$-\frac{1}{4\pi}\int E_k\,(\Delta + k^2)\phi\,dx = \phi(0)$$

for any smooth function $\phi$. We proved this result for the case $k = 0$ in Chapter 15
using Green’s formula. The identical proof works here.
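Away from the origin, the statement that $E_k(r) = e^{ikr}/r$ is annihilated by $\Delta + k^2$ can be checked numerically, using the fact that for a spherically symmetric function the Laplacian reduces to $\partial^2/\partial r^2 + (2/r)\,\partial/\partial r$. The Python sketch below does this with central differences; the function names and the sample values of $r$ and $k$ are our own.

```python
import cmath

def E_k(r, k):
    """The spherical wave e^{ikr} / r."""
    return cmath.exp(1j * k * r) / r

def radial_laplacian(f, r, h=1e-4):
    """Laplacian of a spherically symmetric function: f'' + (2/r) f'."""
    first = (f(r + h) - f(r - h)) / (2 * h)
    second = (f(r + h) - 2 * f(r) + f(r - h)) / (h * h)
    return second + 2 * first / r

def helmholtz_residual(r, k):
    """Approximation of (Laplacian + k^2) E_k at radius r > 0."""
    f = lambda s: E_k(s, k)
    return radial_laplacian(f, r) + k * k * f(r)

print(abs(helmholtz_residual(2.0, 3.0)))   # small, limited by the finite-difference error
```

This only tests the equation for $r \neq 0$; the behaviour at the origin is what the integral identity with $\phi(0)$ expresses.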


Furthermore, if $D$ is any bounded region in $\mathbb{R}^3$ and $u$ satisfies the reduced wave (or
Helmholtz) equation


$$\Delta u + k^2u = 0$$


then the argument from Green’s formula shows that the following formula due to 
Helmholtz is true: 
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where r P denotes the distance from P. 

In many applications we are interested in the situation where D, instead of being 
bounded, represents the exterior to some surface S. Let us first apply the formula to 
the bounded region D R , consisting of the intersection of D with a ball of radius R 
centered at P. 



Figure 19.5 


If R is taken large enough the previous integral becomes a sum of two surface 
integrals, yielding 
where $\Sigma_R$ is the sphere of radius $R$. Now

and $\star dr = R^2\,d\omega$ on $\Sigma_R$, where $d\omega$ is the element of solid angle on $\Sigma_R$. Thus the
second integral becomes


thus the integral over the sphere will go to zero as $R \to \infty$ if



r ( 

• 


21 

du 




\u\ dco = o(l) and 
where the integrals are evaluated for $r = R$. These conditions are known as the
Sommerfeld radiation conditions. Their significance is that they represent the
condition that $u$ consists of expanding waves radiating outward and no incoming
waves*. Let us assume that this condition is satisfied. Then the value of $u$ outside
some surface $S$ is given by
$$u(P) = \frac{1}{4\pi}\int_S\left[\frac{e^{ikr}}{r}\star du - u\star d\left(\frac{e^{ikr}}{r}\right)\right].$$
In this way, the solution exterior to $S$ is described in terms of radiation emitted from
$S$. It was Huyghens who originally had the idea that propagated disturbances in the
wave theory could be represented as the superposition of secondary disturbances
along an intermediate surface such as $S$; but he did not have an adequate
explanation of why there was no backward wave, i.e. why the propagation was only in
the outward direction. The idea that the backward waves would cancel one another
out because of phase differences was due to Fresnel. Fresnel believed that if all the
sources were inside $S$, the secondary radiation (i.e. the integrand in Helmholtz’
formula) from each separate surface element would produce a null effect at each
interior point due to interference. The above argument, due essentially to
Helmholtz, was the first rigorous mathematical treatment of the problem, and shows
that the internal cancellation is due to the total effect of the boundary. Nevertheless,
as we shall see in Chapter 21, Fresnel was right, up to terms of order $1/k$. This is due
to a method for giving an asymptotic evaluation of the integrals in Helmholtz’
formula.

So far, we have dealt with monochromatic radiation $u$, corresponding to the
time-dependent function $v$ where $v(x, y, z, t) = u(x, y, z)e^{-ikt}$. For a fixed point, $P$, let $v_P$
denote the function $v_P(x, y, z, t) = v(x, y, z, t - r)$, where $r$ is the distance from $P$ to
$(x, y, z)$. The substitution into Helmholtz’ formula shows that

$$v(P, t) = -\frac{1}{4\pi}\int_S\left(v_P\star d(1/r) - (1/r)\left(\frac{\partial v}{\partial t}\right)_P\star dr - (1/r)\star(dv)_P\right).$$


* For a precise mathematical explanation of the Sommerfeld radiation conditions see the book
by Lax & Phillips, pp. 120-128. The gist of what they prove is the following. Let $f = \{f_1, f_2\}$ be
Cauchy data for the wave equation; we thus seek a solution of the wave equation

$$\frac{\partial^2 w}{\partial t^2} - \Delta w = 0$$

with $w(x, 0) = f_1$ and $(\partial w/\partial t)(x, 0) = f_2$. We say $f$ is eventually outgoing if there is some constant
$c$ such that $w = 0$ for $|x| < t - c$. If we seek a solution of fixed frequency, then the appropriate
Cauchy data are $\{w, ikw\}$. Suppose that $w$ is a solution of the reduced wave equation outside
some bounded domain. Then $\{w, ikw\}$ is eventually outgoing if and only if the Sommerfeld
radiation conditions are satisfied.









This is Kirchhoff’s formula. Since it is linear in $v$, and does not explicitly involve the
frequency, it is true for any superposition of monochromatic waves of varying
frequencies, and hence for an arbitrary solution of the wave equation. In this form
the relation with Huyghens’ principle is very apparent.




A Maxwell’s equations 

You should be able to state Faraday’s law and Ampere’s law in both integral and
differential form and to show that the two-forms

$$F = B + E\wedge dt \quad\text{and}\quad G = D - H\wedge dt$$

satisfy $G = \star F$ in vacuum.

You should be able to convert Maxwell’s equations to the form $\Box A = 4\pi j$ and to
describe the solutions to this wave equation.

You should know how to formulate energy-momentum conservation for 
electromagnetism in terms of differential forms. 


Exercises 


19.1. In four-dimensional spacetime, if we use units where $c = 1$, $(dt, dt) = 1$,
$(dx, dx) = (dy, dy) = (dz, dz) = -1$ and $\sigma = dt\wedge dx\wedge dy\wedge dz$. A two-form
$F$, which represents the electromagnetic field, may be expressed in terms
of coordinates $x, y, z, t$ as $F = B_z(x, y, t)\,dx\wedge dy - E_x(x, y, t)\,dt\wedge dx - E_y(x, y, t)\,dt\wedge dy$.
Its other terms are all zero.

(a) What relation among the partial derivatives of $B_z$, $E_x$, and $E_y$ must
hold in order that $dF = 0$?

(b) Express $\star d\star F$ in terms of partial derivatives of $B_z$, $E_x$, and $E_y$.

19.2. A Lorentz transformation $\alpha$ corresponding to velocity $v$ along the $x$-axis
may be described in terms of pullback as follows (for convenience, we set
$c = 1$):

$$\alpha^*t' = \gamma(t - vx),$$

$$\alpha^*x' = \gamma(x - vt),$$

$$\alpha^*y' = y, \qquad \alpha^*z' = z,$$

where $\gamma = 1/\sqrt{1 - v^2}$. If $x, y, z, t$ are coordinates used by platform
observers and $x', y', z', t'$ are used by observers on board a train moving
with velocity $v$ along the $x$-axis, then $\alpha^*$ pulls back the train coordinates
$x', y', z', t'$ of an event into the coordinates of that event in the platform
frame of reference.

(a) Show explicitly that $\alpha^*$ preserves the star operator; i.e. that $\alpha^*\star\omega = \star(\alpha^*\omega)$.
Using the star operator for the coordinates $x, y, z, t$; i.e., $\star dt = dx\wedge dy\wedge dz$, etc.,
check the following:

$$\alpha^*\star(dx'\wedge dy'\wedge dz') = \star\alpha^*(dx'\wedge dy'\wedge dz'),$$

$$\alpha^*\star(dt'\wedge dy'\wedge dz') = \star\alpha^*(dt'\wedge dy'\wedge dz'),$$

$$\alpha^*\star(dt'\wedge dx') = \star\alpha^*(dt'\wedge dx'),$$

$$\alpha^*\star(dx'\wedge dy') = \star\alpha^*(dx'\wedge dy').$$

(b) In terms of the train coordinates, the potential $A$ may be expressed as

$$A = A'_t\,dt' + A'_x\,dx' + A'_y\,dy' + A'_z\,dz'.$$

Calculate $\alpha^*A$ and equate it to $A_t\,dt + A_x\,dx + A_y\,dy + A_z\,dz$, thereby
obtaining expressions for $A_t$, $A_x$, etc., in terms of $A'_t$, $A'_x$, etc.

19.3. Carry out the same program as in Exercise 19.2 for the two-form

$$F = B' + E'\wedge dt',$$

$$F = B'_x\,dy'\wedge dz' + \cdots + E'_z\,dz'\wedge dt'.$$

Calculate $\alpha^*F$, compare the result with

$$F = B_x\,dy\wedge dz + \cdots + E_z\,dz\wedge dt,$$

and thereby obtain expressions for the field components $B_x, B_y, B_z; E_x, E_y, E_z$
in terms of $B'_x, B'_y, B'_z, E'_x, E'_y, E'_z$.

19.4. For a point charge at rest at the origin in the train frame, we know that

$$F = B' + E'\wedge dt'$$

with $B' = 0$ and

$$E' = \frac{x'\,dx' + y'\,dy' + z'\,dz'}{(x'^2 + y'^2 + z'^2)^{3/2}}.$$

Calculate $\alpha^*F$, thereby obtaining the description of the fields of a charge in
uniform motion along the $x$-axis.

19.5. From the two-form $F$ it is possible to create two different zero-forms:

$$f_1 = \star(F\wedge F), \qquad f_2 = \star(F\wedge\star F).$$

Express these Lorentz invariant quantities in terms of the components of
$F$; i.e., in terms of $B_x, B_y, \ldots, E_z$.
variable. This theory, in its various ramifications, represents the major achievements
of nineteenth-century mathematics. We shall touch on the key results here. Recall
from Chapters 7-8 the basic facts of the calculus of several variables, in particular
of two variables: Let $x$ and $y$ be the standard coordinates on $\mathbb{R}^2$. The basic objects
in the calculus are functions, linear differential forms, and forms of degree 2. Each
of these may only be defined on an open subset of $\mathbb{R}^2$. A function is a rule, $f$,







Given any function $f$ its differential, $df$, is the linear differential form given by

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy.$$

Stokes’ theorem for functions (the fundamental theorem of the calculus) says that,
if $\gamma$ is any curve going from $p$ to $q$, then $\int_\gamma df = f(q) - f(p)$.

A two-form is an expression like

$$\Omega = c\,dx\wedge dy$$

where $c$ is a function. If $D$ is a bounded region in the plane (contained in the
domain of definition of $\Omega$), then we can form the integral $\int_D\Omega$. If $D$ is given the
standard orientation, then this is just the double integral

$$\int_D\Omega = \iint_D c(x, y)\,dx\,dy.$$

(If $D$ is given the opposite orientation, then we must reverse the sign.) All regions
will be assumed to have a piecewise differentiable boundary which gets an induced
orientation. This boundary (with orientation) is denoted by $\partial D$. If $\omega = a\,dx + b\,dy$,
then $d\omega$ is the two-form given by

$$d\omega = \left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right)dx\wedge dy$$

and Stokes’ theorem (for regions in the plane) says

$$\int_{\partial D}\omega = \int_D d\omega.$$
We assume that the reader is familiar with the basic facts about the complex
numbers. Every complex number $c$ can be written as $c = r + is$, where $r$ and $s$ are
real numbers and $i^2 = -1$. Here $r$ is called the real part and $s$ the imaginary part
of $c$. The number $\bar c = r - is$ is called the complex conjugate of $c$ and $c\bar c = |c|^2 = r^2 + s^2$.
The usual laws, such as the commutative and associative laws
for addition and multiplication and the distributive law, hold.


20.1. Complex-valued functions 

We can now allow the functions (and differential forms) that we have been considering
to take on complex values. Thus a complex-valued function $f$ assigns a complex
number $f(x, y) = u(x, y) + iv(x, y)$ to each point in the plane, where $u$ and $v$ are
real-valued functions. In other words, giving a complex-valued function is the same as
giving a pair of real-valued functions, i.e., the same as giving a map of (an open
subset of) $\mathbb{R}^2$ into $\mathbb{R}^2$. We say that $f$ is differentiable if this map is differentiable,
that is if both functions $u$ and $v$ are differentiable. Thus $\partial f/\partial x = \partial u/\partial x + i\,\partial v/\partial x$
and $\partial f/\partial y = \partial u/\partial y + i\,\partial v/\partial y$ give the partial derivatives of $f$. We can consider various
linear combinations of these partial derivatives with real or complex coefficients.







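For example, we can consider the expression $\tfrac12\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right)$. We can regard
this expression as the result of applying the differential operator $\tfrac12(\partial/\partial x + i\,\partial/\partial y)$
to the function $f$. This operator will be of crucial importance to us in all
that follows. We write

$$\frac{\partial}{\partial\bar z} = \frac12\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right) \tag{20.1}$$

and, similarly, we define

$$\frac{\partial}{\partial z} = \frac12\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right). \tag{20.2}$$

The reasons for this notation will become clear in a little while. Notice that if
$f = u + iv$ then

$$\frac{\partial f}{\partial\bar z} = \frac{\partial u}{\partial\bar z} + i\frac{\partial v}{\partial\bar z} = \frac12\left(\frac{\partial u}{\partial x} + i\frac{\partial u}{\partial y}\right) + \frac{i}{2}\left(\frac{\partial v}{\partial x} + i\frac{\partial v}{\partial y}\right)$$

so, collecting real and imaginary parts,

$$\frac{\partial f}{\partial\bar z} = \frac12\left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) + \frac{i}{2}\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right). \tag{20.3}$$

Notice that for any pair of differentiable functions $f$ and $g$ we have

$$\frac{\partial}{\partial\bar z}(fg) = \frac{\partial f}{\partial\bar z}\,g + f\,\frac{\partial g}{\partial\bar z} \tag{20.4}$$

with a similar equation for $\partial/\partial z$.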
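The operators $\partial/\partial\bar z = \tfrac12(\partial/\partial x + i\,\partial/\partial y)$ and $\partial/\partial z = \tfrac12(\partial/\partial x - i\,\partial/\partial y)$ are easy to experiment with numerically. The Python sketch below approximates them by central differences for a function of a complex argument; the function names `d_dzbar` and `d_dz` and the sample functions are our own illustrative choices.

```python
def d_dzbar(f, z, h=1e-6):
    """Approximate (1/2)(d/dx + i d/dy) f at the point z = x + iy."""
    fx = (f(z + h) - f(z - h)) / (2 * h)              # partial derivative in x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)    # partial derivative in y
    return 0.5 * (fx + 1j * fy)

def d_dz(f, z, h=1e-6):
    """Approximate (1/2)(d/dx - i d/dy) f at the point z = x + iy."""
    fx = (f(z + h) - f(z - h)) / (2 * h)
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (fx - 1j * fy)

z0 = 1.5 - 0.5j
square = lambda z: z * z               # a polynomial in z alone
conj = lambda z: z.conjugate()         # the function z-bar

print(d_dzbar(square, z0), d_dz(square, z0))   # approximately 0 and 2*z0
print(d_dzbar(conj, z0), d_dz(conj, z0))       # approximately 1 and 0
```

The product rule can be tested the same way: for the product of `square` and `conj`, `d_dzbar` returns approximately `z0 * z0`, in agreement with applying the rule to the two factors.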

We define the particular complex function $z$ as $z(x, y) = x + iy$. We write this as

$$z = x + iy \tag{20.5}$$

and similarly

$$\bar z = x - iy.$$

Then

$$z^2 = (x^2 - y^2) + 2ixy,$$
$$\bar z^2 = (x^2 - y^2) - 2ixy.$$

Similarly we can form polynomials in $z$ alone such as $17z^3 - 5z^2 + 2z - 3$ or polynomials
in $\bar z$ or mixed polynomials like $3z^2\bar z - 2z\bar z^3 + z^2 - 5\bar z + 1$, etc. All these
are examples of complex-valued functions. If we substitute $z$ for $f$ in (20.3) where




(20.6) 


and, similarly, 



20.2. Complex-valued differential forms 


We can also consider complex-valued differential forms. A linear differential form
is an expression

$$a\,dx + b\,dy$$

where now $a$ and $b$ are complex-valued functions. The notion of line integral and
the (zero-dimensional) Stokes’ theorem are as before, with complex numbers as
values. We define the linear differential forms $dz$ and $d\bar z$ by

$$dz = dx + i\,dy, \quad\text{and}\quad d\bar z = dx - i\,dy.$$

Notice that these differential forms are linearly independent at each point and in fact

$$dx = \frac12(dz + d\bar z), \qquad dy = \frac{1}{2i}(dz - d\bar z).$$

We can thus write any linear differential form as

$$a\,dx + b\,dy = A\,dz + B\,d\bar z$$

where

$$A = \tfrac12(a - ib) \quad\text{and}\quad B = \tfrac12(a + ib).$$


In particular, for any (complex-valued) differentiable function $f$, we have

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = \frac12\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right)dz + \frac12\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right)d\bar z = \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial \bar z}\,d\bar z. \tag{20.7}$$

We have already used this remark in the preceding section to conclude that
$(\partial/\partial \bar z)P(z) = 0$, i.e., that any polynomial (in $z$ alone) is holomorphic. Similarly any
quotient of two polynomials, $P/Q$, is holomorphic in
a region $D$, provided $Q$ does not vanish anywhere in $D$.

In view of (20.7), a function $f$ is holomorphic if and only if

$$df = h\,dz \tag{20.9}$$

for some function $h$. Equivalently, we can write this condition as

$$d(f\,dz) = 0. \tag{20.10}$$

Let us write $f = u + iv$ where the real-valued functions $u$ and $v$ are the real and
imaginary parts of $f$. Then, setting the real and imaginary parts of $\partial f/\partial \bar z$ equal
to zero, we see by (20.3) that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \tag{20.11}$$

These are the Cauchy-Riemann equations, a system of first-order partial differential
equations for $u$ and $v$. Again, $f$ is holomorphic if and only if the Cauchy-Riemann
equations hold.

Consider the differential form

$$u\,dx + v\,dy.$$

Then, relative to the Euclidean metric in the plane, the Cauchy-Riemann equations
are equivalent to

$$d(u\,dx + v\,dy) = 0$$

and

$$d{*}(u\,dx + v\,dy) = 0.$$

Notice that they are the analogues, in $\mathbb{R}^2$, of the Maxwell equations in $\mathbb{R}^{1,3}$. In
Maxwell’s equations we dealt with a two-form on four-space. The Cauchy-Riemann
equations are for a one-form on two-space. As we remarked in Chapter 18, the
star operator $*: \Lambda^{n/2} \to \Lambda^{n/2}$ ($n$ even) depends only on the conformal structure.

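Let us think of $f$ as giving a map of (a subset of) $\mathbb{R}^2$ into $\mathbb{R}^2$, sending $(x, y)$ into
$(u(x, y), v(x, y))$. The Jacobian matrix of this map is

$$\begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix}.$$

Equation (20.11) says that this matrix has the form
$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$. Now a matrix in
$\mathbb{R}^2$ is of this form, with $a^2 + b^2 \neq 0$, if and only if it is conformal (i.e., preserves
angles) and orientation-preserving. Thus we can rewrite (20.11) as

$$\begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix} \text{ either is zero or is conformal and orientation-preserving.} \tag{20.12}$$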


Let $f = u + iv$ and $g = r + is$ be complex functions. We think of $f$ and $g$ as maps
of (subsets of) $\mathbb{R}^2$ into $\mathbb{R}^2$. If the image of $g$ lies in the domain of $f$, then we can
form the composite function $f \circ g$. By the chain rule, the Jacobian matrix of $f \circ g$
(thought of as a map of $\mathbb{R}^2$ into $\mathbb{R}^2$) when evaluated at $(x, y)$ is the product of the
Jacobian matrix of $f$, evaluated at $(r(x,y), s(x,y))$, with the Jacobian matrix of $g$,
evaluated at $(x,y)$. Now the product of two conformal, orientation-preserving
matrices is again conformal and orientation-preserving. Also the product of the
zero matrix with any matrix is zero. Thus it follows from (20.12) that the composition
of two holomorphic functions is again holomorphic. The expression that we use
for the composition of two holomorphic functions is standard: if, for example, $f$
is given by $f(z) = e^z$ and $g$ is given by $g(z) = 3z^2 - 2$, then $f \circ g(z) = e^{3z^2 - 2}$. Also, since
$\partial/\partial z$ is a linear combination of the operators $\partial/\partial x$ and $\partial/\partial y$ (with constant
coefficients), the chain rule implies that

$$\frac{\partial}{\partial z}(f \circ g) = \left(\frac{\partial f}{\partial z} \circ g\right)\frac{\partial g}{\partial z}. \tag{20.13}$$


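So, in the case of the preceding example,

$$\frac{\partial}{\partial z}\,e^{3z^2 - 2} = 6z\,e^{3z^2 - 2}.$$

There is another way of writing (20.12) as one single condition. Consider the matrix

$$J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

A matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ commutes with $J$, i.e., satisfies

$$AJ = JA,$$

if and only if $a = d$ and $c = -b$, as can be seen by multiplying out both sides.
Thus $AJ = JA$ if and only if $A$ has the form $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$
(where now $a$ and $b$ can both be zero). Thus we can write (20.12) as

$$\begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix}. \tag{20.14}$$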

We can give an interesting interpretation to the condition $AJ = JA$ and thus to
(20.14). Let us regard $A$ as a linear transformation of the
two-dimensional vector space $\mathbb{R}^2$ into itself. Now we can identify $\mathbb{R}^2$ with the
one-dimensional complex vector space $\mathbb{C}$ by identifying the vector
$\begin{pmatrix} x \\ y \end{pmatrix}$ with the
complex number $x + iy$. Multiplication by $i$ sends $x + iy$ into $-y + ix$. But the
complex number $-y + ix$ corresponds to the vector
$\begin{pmatrix} -y \\ x \end{pmatrix} = J\begin{pmatrix} x \\ y \end{pmatrix}$. Thus $J$ is
the matrix of multiplication by $i$ from the real point of view. Now any linear
transformation of the one-dimensional complex vector space $\mathbb{C}$ is a linear
transformation of $\mathbb{R}^2$. But not every linear transformation $A$ of $\mathbb{R}^2$ over the reals
corresponds to a linear transformation of $\mathbb{C}$ over the complex numbers. A complex
linear transformation must commute with multiplication by complex numbers, in
particular with multiplication by $i$, and so must
satisfy $AJ = JA$. If $A$ satisfies this equation, so that
$A = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}$, then
$A\begin{pmatrix} x \\ y \end{pmatrix}$ corresponds to $(a + ib)(x + iy)$. Multiplication by a complex number on $\mathbb{C}$ is
obviously complex linear. Thus we can write (20.14) as

$$\text{The transformation } \begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix} \text{ is complex linear.} \tag{20.15}$$
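The commutation condition $AJ = JA$ can be checked numerically on a Jacobian matrix. This is a rough sketch only: the helper names (`jacobian`, `matmul`, `commutator_size`), the sample point, and the two sample functions are assumptions made for illustration, and the partial derivatives are approximated by central differences.

```python
def jacobian(f, x, y, h=1e-6):
    """2x2 real Jacobian of (x, y) -> (Re f, Im f), by central differences."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return [[fx.real, fy.real], [fx.imag, fy.imag]]


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


J = [[0.0, -1.0], [1.0, 0.0]]  # multiplication by i, from the real point of view


def commutator_size(A):
    """Largest entry of AJ - JA in absolute value."""
    AJ, JA = matmul(A, J), matmul(J, A)
    return max(abs(AJ[i][j] - JA[i][j]) for i in range(2) for j in range(2))


A_holo = jacobian(lambda z: z ** 3, 0.5, -1.2)          # a polynomial in z alone
A_anti = jacobian(lambda z: z.conjugate(), 0.5, -1.2)   # z-bar, not holomorphic

print(commutator_size(A_holo))  # close to 0: the Jacobian commutes with J
print(commutator_size(A_anti))  # clearly nonzero
```

For the polynomial the Jacobian has the form with equal diagonal entries and opposite off-diagonal entries, while for $\bar z$ it is a reflection and fails to commute with $J$.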


So far none of the above equivalent formulations, with the exception of the
Cauchy-Riemann equations (20.11), would be familiar to a nineteenth-century
mathematician. We now turn to the classical formulation, one which is closely
related to (20.15). Let $f$ be a differentiable function, not necessarily holomorphic.
Let $\begin{pmatrix} x \\ y \end{pmatrix}$ be a point in the domain of definition of $f$. Let
$\begin{pmatrix} k \\ l \end{pmatrix}$ be a small vector.
Then we have

$$f(x + k, y + l) - f(x, y) = \frac{\partial f}{\partial x}k + \frac{\partial f}{\partial y}l + o\left(\sqrt{k^2 + l^2}\right)$$

by the definition of differentiability. (Here the partial derivatives are evaluated at
$\begin{pmatrix} x \\ y \end{pmatrix}$.) Let us write $z = x + iy$ and $h = k + il$ and write $f(z)$ instead of $f(x, y)$.
So we may write the preceding equation as

$$f(z + h) - f(z) = \frac{\partial f}{\partial x}k + \frac{\partial f}{\partial y}l + o(|h|).$$

By (20.7) we can write this as

$$f(z + h) - f(z) = \frac{\partial f}{\partial z}h + \frac{\partial f}{\partial \bar z}\bar h + o(|h|).$$

This is an equation involving complex numbers, so, if $h \neq 0$, we can divide both
sides of the equation by $h$ so as to obtain

$$\frac{f(z + h) - f(z)}{h} = \frac{\partial f}{\partial z} + \frac{\partial f}{\partial \bar z}\,\frac{\bar h}{h} + o(1).$$


Notice that the left-hand side looks like an ordinary difference quotient, but relative
to the complex numbers. Suppose we let $|h| \to 0$. The value of $\bar h/h$ can be any
complex number of absolute value one. (For instance, if $h$ is real, then $\bar h/h = 1$
while, if $h = si$ is pure imaginary, then $\bar h/h = -1$.) So the limiting expression on
the right may depend on the angle at which $h \to 0$. If we want this limit to exist,
that is, to be independent of the angle of approach, we must have $\partial f/\partial \bar z = 0$ at
the point $\begin{pmatrix} x \\ y \end{pmatrix}$. Thus if $f$ is holomorphic, we conclude that the limit on the
left-hand side exists at all points in the domain of definition of $f$ and this limit equals
$\partial f/\partial z$, which is a continuous function. Conversely, suppose that at each point the
limit

$$\lim_{h \to 0}\frac{f(z + h) - f(z)}{h} = f'(z)$$

exists (independently of how $h$ approaches zero) and is a continuous function.
Then taking $h$ real shows that $\partial f/\partial x$ exists and is continuous and taking $h$ purely
imaginary shows that $\partial f/\partial y$ exists and is continuous. Thus $f$ is continuously
differentiable. The preceding argument then shows that $\partial f/\partial \bar z = 0$ and
$f'(z) = \partial f/\partial z$. Thus we have proved that $f$ is holomorphic if and only if

$$\lim_{h \to 0}\frac{f(z + h) - f(z)}{h} \tag{20.16}$$

exists (independently of how $h$ approaches zero) and is a continuous function. In
short, we can formulate (20.16) as saying that $f$ is continuously differentiable from
the complex point of view. This is the approach that was mainly taken in
nineteenth-century mathematics. It is somewhat deceptive in that the condition of complex
differentiability is so much more restrictive than the standard notion of
differentiability. Nevertheless we shall adopt $f'(z)$ as a more convenient notation than
$\partial f/\partial z$.
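The dependence of the difference quotient on the angle of approach is easy to see numerically. The sketch below is illustrative only: the function name `difference_quotient`, the base point, the step size, and the chosen angles are assumptions, not part of the text.

```python
import cmath


def difference_quotient(f, z, angle, size=1e-6):
    """(f(z + h) - f(z)) / h for a small h pointing in the given direction."""
    h = size * cmath.exp(1j * angle)
    return (f(z + h) - f(z)) / h


z0 = 0.3 + 0.7j
angles = [0.0, cmath.pi / 4, cmath.pi / 2]

# z^2 is holomorphic: the quotient is nearly the same for every direction.
holo = [difference_quotient(lambda z: z * z, z0, t) for t in angles]

# z-bar is not: the quotient equals h-bar/h, which depends on the direction.
anti = [difference_quotient(lambda z: z.conjugate(), z0, t) for t in angles]

for t, q_holo, q_anti in zip(angles, holo, anti):
    print(t, q_holo, q_anti)
```

For $\bar z$ the quotient is close to $1$ for real $h$ and close to $-1$ for purely imaginary $h$, matching the two cases mentioned above.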

We have observed that Stokes’ theorem and (20.10) imply that

$$\int_{\partial D} f(z)\,dz = 0, \tag{20.17}$$

which is Cauchy’s integral theorem, for any domain $D$ contained in the domain
of definition of a holomorphic function $f$. Conversely, if $f$ is any function, then
$\int_{\partial D} f(z)\,dz = \int_D d(f\,dz)$. This integral can vanish for all $D$ only if the integrand
vanishes identically, i.e., if $f$ is holomorphic.

We have thus seen that the conditions (20.8)-(20.17) are all equivalent. A function
that satisfies any one (and hence all) of them is holomorphic.


20.4. The calculus of residues

We begin by calculating the line integral $\int_\gamma z^n\,dz$ where $\gamma$ is a circle centered at the
origin. If $n \geq 0$, the function $z^n$ is holomorphic in the entire plane and it follows
from Cauchy’s theorem that the integral is zero. For negative values of $n$, the
function $z^n$ is not holomorphic (and not defined) at the origin and hence we must
evaluate the integral directly. If $\gamma$ has radius $r$, we can use polar coordinates to
write $z = re^{i\theta}$ and thus, along $\gamma$, $dz = ire^{i\theta}\,d\theta$. Thus

$$z^n\,dz = i\,r^{n+1}e^{i(n+1)\theta}\,d\theta$$

and

$$\int_\gamma z^n\,dz = i\,r^{n+1}\int_0^{2\pi} e^{i(n+1)\theta}\,d\theta.$$

This last integral vanishes except when $n = -1$, when it equals $2\pi$. In this case
$r^{n+1} = 1$ and so

$$\int_\gamma z^{-1}\,dz = 2\pi i.$$

If we now substitute the above expansion for $f$, and integrate term by term, all
the negative powers give zero except $a_{-1}z^{-1}$, while $\int_\gamma g\,dz = 0$ since $g$ is holomorphic.
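The computation of $\int_\gamma z^n\,dz$ can be reproduced numerically. This is a minimal sketch under stated assumptions: the helper `circle_integral`, its parameters, and the choice of radius are hypothetical, and the circle is parametrized by $z = re^{i\theta}$ with the integral replaced by a Riemann sum over equally spaced angles.

```python
import cmath


def circle_integral(f, center=0j, radius=1.0, steps=2000):
    """Approximate the integral of f(z) dz around a counterclockwise circle."""
    total = 0j
    for k in range(steps):
        e = cmath.exp(2j * cmath.pi * k / steps)
        z = center + radius * e
        dz = 1j * radius * e * (2 * cmath.pi / steps)  # dz = i r e^{i theta} d theta
        total += f(z) * dz
    return total


for n in (-3, -2, -1, 0, 1, 2):
    value = circle_integral(lambda z, n=n: z ** n, radius=2.5)
    print(n, value)  # only n = -1 gives a value far from zero
```

The value for $n = -1$ does not depend on the radius, in agreement with $r^{n+1} = 1$.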



Now there is nothing special about the origin, which we can replace by any other
point in $D$, or by a finite number of points. A function $f$ is said to have a pole at
the point $\alpha$ if it is holomorphic in $U - \{\alpha\}$ where $U$ is some neighborhood of $\alpha$
and if (near $\alpha$) it has an expansion

$$f(z) = a_{-n}(z - \alpha)^{-n} + \cdots + a_{-1}(z - \alpha)^{-1} + g$$

where $g$ is holomorphic near $\alpha$. The number $a_{-1}$ is called the residue of $f$ at $\alpha$
and is written $\operatorname{res}_\alpha(f)$. The integral of $f\,dz$ around a
small circle about $\alpha$ equals $2\pi i\,a_{-1} = 2\pi i \operatorname{res}_\alpha(f)$. A function which is holomorphic
in a domain $D$ except for poles is called meromorphic.

Figure 20.3

Suppose that $f$ is meromorphic in $D$ with a finite number of poles. Then we can
draw little circles about each of the poles, and apply the Cauchy integral theorem
to conclude that

$$\int_{\partial D} f(z)\,dz = 2\pi i \sum_{\text{poles } \alpha} \operatorname{res}_\alpha(f).$$






This is known as Cauchy’s residue theorem. It is very useful in the computation
of definite integrals. For example, suppose we wish to evaluate the definite integral

$$\int_0^\pi \frac{d\theta}{a + \cos\theta}$$

where $a > 1$. We write this as

$$\frac12\int_0^{2\pi} \frac{d\theta}{a + \cos\theta}.$$

Let us set $z = e^{i\theta}$ so that, on the unit circle, $dz/iz = d\theta$. Also,
$2\cos\theta = e^{i\theta} + e^{-i\theta} = z + z^{-1}$. Thus the preceding integral becomes

$$\frac12\int_\gamma \frac{dz}{iz\left(a + \frac12(z + z^{-1})\right)} = \frac{1}{i}\int_\gamma \frac{dz}{z^2 + 2az + 1}$$

where $\gamma$ is the unit circle. But

$$z^2 + 2az + 1 = (z - \alpha_1)(z - \alpha_2),$$

where

$$\alpha_1 = -a + \sqrt{a^2 - 1}$$

and

$$\alpha_2 = -a - \sqrt{a^2 - 1}$$

and we have the partial fraction expansion

$$\frac{1}{z^2 + 2az + 1} = \frac{1}{\alpha_1 - \alpha_2}\left(\frac{1}{z - \alpha_1} - \frac{1}{z - \alpha_2}\right).$$

Now, since $a > 1$, it follows that $\alpha_2 < -1$ and so $\alpha_2$ lies outside the unit circle.
Since $\alpha_1\alpha_2 = 1$, it follows that $\alpha_1$ lies inside the unit circle and the above partial
fraction expansion shows (since $1/(z - \alpha_2)$ is holomorphic inside the unit circle)
that the residue is

$$\frac{1}{\alpha_1 - \alpha_2} = \frac{1}{2\sqrt{a^2 - 1}}.$$

Thus, by Cauchy’s residue theorem,

$$\int_0^\pi \frac{d\theta}{a + \cos\theta} = \frac{1}{i}\int_\gamma \frac{dz}{z^2 + 2az + 1} = \frac{\pi}{\sqrt{a^2 - 1}}.$$
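As a sanity check on the residue computation, the closed form $\pi/\sqrt{a^2 - 1}$ can be compared with a direct numerical quadrature. The sketch below is illustrative; the function names and the midpoint rule with `steps` subintervals are choices made here, not something prescribed by the text.

```python
import math


def integral_by_quadrature(a, steps=20000):
    """Midpoint rule for the integral of 1/(a + cos t) over 0 <= t <= pi."""
    dt = math.pi / steps
    return sum(dt / (a + math.cos((k + 0.5) * dt)) for k in range(steps))


def integral_by_residues(a):
    """Closed form obtained from Cauchy's residue theorem, valid for a > 1."""
    return math.pi / math.sqrt(a * a - 1)


def roots(a):
    """The two roots alpha_1, alpha_2 of z^2 + 2az + 1."""
    s = math.sqrt(a * a - 1)
    return -a + s, -a - s


for a in (1.5, 2.0, 5.0):
    alpha1, alpha2 = roots(a)
    # alpha1 lies inside the unit circle, alpha2 outside, and alpha1 * alpha2 = 1
    print(a, integral_by_quadrature(a), integral_by_residues(a), alpha1, alpha2)
```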


We shall soon see that any rational function $R(z) = P(z)/Q(z)$ is meromorphic with
poles located at the zeros of $Q$. (This follows immediately from the partial fraction
expansion for rational functions. But we will not assume this here.) We shall also
get an effective way of evaluating the residues. This will, for example, allow us to
evaluate integrals of the form $\int_0^{2\pi} R(\cos\theta, \sin\theta)\,d\theta$ by setting $z = e^{i\theta}$,
$\cos\theta = \frac12(z + z^{-1})$, and $\sin\theta = (1/2i)(z - z^{-1})$ and proceeding as above.

We begin with a lemma. Suppose that $g$ is holomorphic in $\bar D$† except for a finite
number of points $a_1, \ldots, a_k$. Suppose that at each one of these points the function
$g$ satisfies

$$\lim_{z \to a_i} |z - a_i|\,|g(z)| = 0.$$

Then

$$\int_{\partial D} g\,dz = 0.$$

† $\bar D$ is the closure of $D$: it includes all points of $D$ plus points on the boundary.

Indeed, by putting little circles $\gamma_i$ around each $a_i$, we conclude as before that
$\int_{\partial D} g\,dz = \sum_i \int_{\gamma_i} g\,dz$. Now

$$\left|\int_{\gamma_i} g(z)\,dz\right| \leq 2\pi r M_r$$

where $r$ is the radius of $\gamma_i$ and $M_r$ is the maximum of $|g|$ on $\gamma_i$. By assumption,
$rM_r \to 0$ as $r \to 0$. Thus, by shrinking the radius of the circles, we conclude that
$\int_{\partial D} g\,dz = 0$.

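Now let $f$ be holomorphic in $\bar D$ and let $a$ be a point of $D$. We apply the preceding
lemma to the function

$$g(z) = \frac{f(z) - f(a)}{z - a}.$$

The function $g$ is holomorphic in $\bar D$ except at the single point $a$. Furthermore,
$(z - a)g(z) = f(z) - f(a)$ and this tends to zero as $z \to a$. Thus

$$\int_{\partial D} \frac{f(z) - f(a)}{z - a}\,dz = 0.$$

Since $\int_{\partial D} dz/(z - a) = 2\pi i$, this says that $f(a) = \frac{1}{2\pi i}\int_{\partial D} \frac{f(z)}{z - a}\,dz$.
For convenience we shall replace $a$ by $z$ and $z$ by $\zeta$ and write Cauchy’s integral
formula

$$f(z) = \frac{1}{2\pi i}\int_{\partial D} \frac{f(\zeta)}{\zeta - z}\,d\zeta. \tag{20.18}$$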


We use the following lemma. Let $\varphi$ be a complex-valued function which is
defined and continuous on the boundary, $\partial D$, of some domain $D$. Then the functions

$$F_n(z) = \int_{\partial D} \frac{\varphi(\zeta)}{(\zeta - z)^n}\,d\zeta$$

are all holomorphic in $D$ and satisfy

$$F'_n(z) = nF_{n+1}(z).$$


We prove this by induction on $n$. First of all we prove that $F_1$ is continuous in
$D$. Let $z_0$ be some point of $D$ and choose $a$ so small that the neighborhoods
$|z'' - z_0| < 2a$ and $|z' - z_0| < a$ lie entirely in $D$. Thus, the distance of $z'$ to all points
of $\partial D$ is at least $a$ and hence, for $\zeta \in \partial D$,

$$\frac{1}{|\zeta - z'|\,|\zeta - z_0|} < \frac{1}{a^2}$$

for all such $z'$. Now, suppressing the prime,

$$\frac{1}{\zeta - z} - \frac{1}{\zeta - z_0} = \frac{z - z_0}{(\zeta - z)(\zeta - z_0)}$$

so

$$|F_1(z) - F_1(z_0)| = |z - z_0|\left|\int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z)(\zeta - z_0)}\right| < |z - z_0|\,\frac{1}{a^2}\int_{\partial D} |\varphi(\zeta)|\,|d\zeta|.$$

This shows that $F_1$ is continuous. Furthermore,

$$\frac{F_1(z) - F_1(z_0)}{z - z_0} = \int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z)(\zeta - z_0)}.$$

Keeping $z_0$ fixed, the function $\varphi(\zeta)/(\zeta - z_0)$ is continuous on $\partial D$. Hence, the
right-hand side of the above equation is a continuous function of $z$, and equals
the left-hand side at all $z \neq z_0$. Letting $z \to z_0$ shows that $F_1$ is differentiable in
the complex sense and that

$$F'_1(z_0) = \int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z_0)^2} = F_2(z_0).$$


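Now

$$\frac{1}{(\zeta - z)^2} = \frac{1}{(\zeta - z)(\zeta - z_0)} + \frac{z - z_0}{(\zeta - z)^2(\zeta - z_0)}$$

so

$$F_2(z) - F_2(z_0) = \left[\int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z)(\zeta - z_0)} - \int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z_0)^2}\right] + (z - z_0)\int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z)^2(\zeta - z_0)},$$

showing by the preceding argument that $F_2$ is continuous, and hence that $F_1$ is
holomorphic. Also

$$\frac{F_2(z) - F_2(z_0)}{z - z_0} = \frac{R_1(z) - R_1(z_0)}{z - z_0} + \int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z)^2(\zeta - z_0)},$$

where $R_1(z) = \int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z)(\zeta - z_0)}$ is the function $F_1$ associated to the
continuous function $\varphi(\zeta)/(\zeta - z_0)$. By what we have already proved, $R_1$ is
differentiable and $R'_1(z_0) = \int_{\partial D} \varphi(\zeta)\,d\zeta/(\zeta - z_0)^3 = F_3(z_0)$. Taking
the limit in the above equation, we conclude that $F_2$ is differentiable in the complex
sense and that

$$F'_2 = 2F_3.$$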

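We can now proceed by induction. Suppose that we have
proved that $F'_{n-1} = (n - 1)F_n$ for all continuous functions $\varphi$. Using the identity

$$F_n(z) - F_n(z_0) = \left[\int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z)^{n-1}(\zeta - z_0)} - \int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z_0)^n}\right] + (z - z_0)\int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z)^n(\zeta - z_0)}$$

$$= R_{n-1}(z) - R_{n-1}(z_0) + (z - z_0)\int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z)^n(\zeta - z_0)},$$

where the functions

$$R_n(z) = \int_{\partial D} \frac{\varphi(\zeta)\,d\zeta}{(\zeta - z)^n(\zeta - z_0)}$$

are associated to the continuous function $\varphi(\zeta)/(\zeta - z_0)$, we see that $F_n$ is
continuous.

By induction $R'_{n-1}(z_0) = (n - 1)R_n(z_0) = (n - 1)F_{n+1}(z_0)$, so dividing the preceding equation
by $z - z_0$ and letting $z \to z_0$ we obtain
$F'_n(z_0) = (n - 1)F_{n+1}(z_0) + F_{n+1}(z_0) = nF_{n+1}(z_0)$, completing
the induction.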

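We now apply this result to the Cauchy integral formula, (20.18). Since $f$ is
continuous on $\partial D$, we conclude that a function $f$ which is holomorphic in $\bar D$ must
be infinitely differentiable there, each of its (complex) derivatives $f^{(k)}$ is holomorphic,
and that

$$f^{(n)}(z) = \frac{n!}{2\pi i}\int_{\partial D} \frac{f(\zeta)}{(\zeta - z)^{n+1}}\,d\zeta. \tag{20.19}$$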
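Cauchy’s integral formula, and the corresponding formula for the derivatives obtained from $F'_n = nF_{n+1}$, can be tested numerically on a circle. The following is a sketch under assumptions made here: $D$ is taken to be a disk, the helper name `cauchy_derivative` and its parameters are hypothetical, and the contour integral is replaced by a Riemann sum over equally spaced points on the circle.

```python
import cmath
import math


def cauchy_derivative(f, z, n=0, center=0j, radius=1.0, steps=4000):
    """n!/(2 pi i) times the integral of f(zeta)/(zeta - z)^(n+1) over a circle.

    For n = 0 this is the right-hand side of Cauchy's integral formula.
    The point z must lie inside the circle.
    """
    total = 0j
    for k in range(steps):
        e = cmath.exp(2j * cmath.pi * k / steps)
        zeta = center + radius * e
        dzeta = 1j * radius * e * (2 * cmath.pi / steps)
        total += f(zeta) / (zeta - z) ** (n + 1) * dzeta
    return math.factorial(n) * total / (2j * cmath.pi)


z = 0.2 + 0.1j
print(cauchy_derivative(cmath.exp, z, n=0), cmath.exp(z))  # f itself
print(cauchy_derivative(cmath.exp, z, n=3), cmath.exp(z))  # third derivative of exp
print(cauchy_derivative(cmath.sin, z, n=1), cmath.cos(z))  # first derivative of sin
```

Only boundary values of $f$ enter the computation, which is the point of the formula.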



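There are many important consequences of (20.18) and (20.19) which we will
derive in the next few sections. We begin with the principle of removable singularities.
Suppose that a function $f$ is defined and holomorphic in $D - \{a\}$ where $a \in D$; i.e.,
$f$ is holomorphic at all points of $D$ except possibly at $a$. Then $f$ can be extended to
a function holomorphic in all of $D$ if and only if $\lim_{z \to a}(z - a)f(z) = 0$.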







Proof. Let $f(a)$ denote the value of the extended function at $a$. Since the extended
function is holomorphic, it is continuous at $a$, and $\lim_{z \to a} f(z) = f(a)$. Therefore
$\lim_{z \to a}(z - a)f(z) = \lim_{z \to a}(z - a)f(a) = 0$, which proves the necessity. To prove the
sufficiency, draw a circle $\gamma$ about $a$ which lies entirely within $D$. Let $\gamma_1$ be a
small circle about
$a$ lying inside $\gamma$. Then, at all points in the annulus between $\gamma_1$ and $\gamma$, Cauchy’s
integral formula (20.18) says that

$$f(z) = \frac{1}{2\pi i}\int_\gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta - \frac{1}{2\pi i}\int_{\gamma_1} \frac{f(\zeta)}{\zeta - z}\,d\zeta.$$

By hypothesis, if we hold $z$ fixed and shrink $\gamma_1$ to $a$, the second integral goes to
zero. Thus

$$f(z) = \frac{1}{2\pi i}\int_\gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta$$

for all $z \neq a$ inside $\gamma$. This shows that the right-hand side gives a holomorphic
function, defined throughout the interior of $\gamma$, which coincides with $f$ at $z \neq a$.
This extension is clearly unique since Cauchy’s integral formula must hold for it.
We shall continue to denote this extended function by $f$.

Let $f$ be holomorphic in a domain $D$ and let $a$ be a point of $D$. Let us apply
the principle of removable singularities to the difference quotient

$$f_1(z) = \frac{f(z) - f(a)}{z - a}.$$

The function $f_1$ is clearly defined and holomorphic in $D - \{a\}$ and satisfies
$\lim_{z \to a}(z - a)f_1(z) = 0$. Hence it extends to a holomorphic function defined
throughout $D$. Thus

$$f(z) = f(a) + (z - a)f_1(z).$$

We can apply the same result to $f_1$ so as to define the function $f_2$ by writing
$f_1(z) = f_1(a) + (z - a)f_2(z)$ and in fact take the method $n$ steps:

$$f(z) = f(a) + (z - a)f_1(z),$$
$$f_1(z) = f_1(a) + (z - a)f_2(z),$$
$$\vdots$$
$$f_{n-1}(z) = f_{n-1}(a) + (z - a)f_n(z),$$

so that

$$f(z) = f(a) + (z - a)f_1(a) + \cdots + (z - a)^{n-1}f_{n-1}(a) + (z - a)^n f_n(z).$$

Differentiating $k \leq n$ times, we see that $f_k(a) = (1/k!)f^{(k)}(a)$, so we get the Taylor
expansion with remainder

$$f(z) = f(a) + (z - a)f'(a) + \cdots + \frac{(z - a)^{n-1}}{(n - 1)!}f^{(n-1)}(a) + (z - a)^n f_n(z), \tag{20.20}$$

where, as above,

$$f_n(z) = \frac{1}{2\pi i}\int_\gamma \frac{f(\zeta)}{(\zeta - a)^n(\zeta - z)}\,d\zeta \tag{20.21}$$

for all $z$ inside $\gamma$.

We can derive a useful estimate on the remainder term from (20.21). Suppose
that $\gamma$ is a circle of radius $R$ centered at $a$ and that
$|f(\zeta)| \leq M$ for $\zeta$ on $\gamma$. Substituting into (20.21) gives

$$|f_n(z)| \leq \frac{M}{R^{n-1}(R - |z - a|)}. \tag{20.22}$$

Since $f_n(a) = (1/n!)f^{(n)}(a)$ we obtain, from (20.21) and (20.22), the formula and
estimate

$$f^{(n)}(a) = \frac{n!}{2\pi i}\int_\gamma \frac{f(\zeta)}{(\zeta - a)^{n+1}}\,d\zeta \tag{20.23}$$

and

$$|f^{(n)}(a)| \leq \frac{n!\,M}{R^n} \tag{20.24}$$

if we take $\gamma$ to be a circle of radius $R$ centered at $a$ and where $M$ is the maximum
of $|f|$ on this circle.

Notice the remarkable logic of the situation. We started out with the assumption
that the function $f$ possesses a continuous derivative of the first order which satisfies
the Cauchy-Riemann partial differential equations. These equations imply that $f$
has derivatives of all orders and that the $n$th derivative of $f$, in the complex sense,
exists and is given by the simple expression (20.23) in terms of the values of
$f$ itself.

We shall draw some consequences of these facts in the next section.







20.5. Applications and consequences 

As a first illustration of an application of (20.24), we prove Liouville’s theorem
which asserts that

A function which is holomorphic and bounded in the whole plane must be a
constant.

Proof. Let $|f(z)| \leq M$ for all $z$. In (20.24) take $n = 1$. We may choose $R$ arbitrarily
large, which shows that $f'(a) = 0$ for all $a$. Thus both partial derivatives of $f$ vanish
identically and so $f$ is a constant.

An immediate consequence of Liouville’s theorem is the fundamental theorem of
algebra which says that:

Any polynomial $P$ of positive degree has at least one root.

Proof. Write $P(z) = a_n z^n + a_{n-1}z^{n-1} + \cdots + a_0$, where $n > 0$, and $a_n \neq 0$. Then

$$|P(z)| \geq \left[\,|a_n| - \left(|a_{n-1}|\,|z|^{-1} + \cdots + |a_0|\,|z|^{-n}\right)\right]|z|^n.$$

For $|z|$ large enough, the expression in square brackets is $> \frac12|a_n|$. Thus $|P(z)| \to \infty$
as $|z| \to \infty$ and hence the function $1/P$ is holomorphic and bounded outside some
large circle. If $P(z)$ were never zero, the function $1/P$ would be holomorphic and
bounded in the entire plane, and hence would have to be a constant. Since $P$ is
not a constant, this is impossible. Thus there must be at least one zero of $P$,
proving the theorem.

If $P(\alpha) = 0$, then $(z - \alpha)[P(z)/(z - \alpha)] \to 0$, and so we could apply the principle
of removable singularities to conclude that $P(z)/(z - \alpha)$ is again a holomorphic
function. In fact, since $z^j - \alpha^j = (z - \alpha)(z^{j-1} + \cdots + \alpha^{j-1})$, it follows that $P(z) - P(\alpha)$
is divisible by $z - \alpha$, and since $P(\alpha) = 0$, we see that $P(z)/(z - \alpha)$ is a polynomial of
degree $n - 1$. If $n > 1$, we can apply the fundamental theorem of algebra once
again. We conclude:

A polynomial of degree $n$ has exactly $n$ roots (counted with multiplicity).

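As our next application we show that a holomorphic function, defined in some
connected region $D$, cannot vanish together with all its derivatives at some point
$a$ inside $D$ unless it is identically zero. Indeed, suppose that $f(a) = 0$ and $f^{(n)}(a) = 0$
for all $n$. Let us choose $R$ so that the disk of radius $R$ centered at $a$ lies entirely
within the region $D$. We first show that $f$ is identically zero in this disk. By (20.20)
and (20.22) we know that

$$|f(z)| = |z - a|^n\,|f_n(z)| \leq \frac{M\,|z - a|^n}{R^{n-1}(R - |z - a|)}$$

for every $n$, where $M$ is the maximum of $|f|$ on the circle of radius $R$ about $a$. For
$|z - a| < R$ the right-hand side tends to zero as $n \to \infty$, so $f(z) = 0$ throughout the disk.

Figure 20.5

If $b$ is any other point of $D$, we can join $a$ to $b$ by a finite chain of such disks,
each centered at a point of the preceding one,
starting with one centered at $a$ and ending with one containing $b$ to conclude that
$f(b) = 0$. We have thus proved:

Suppose that $f$ is holomorphic in some connected region $D$ and $f$ vanishes
with all its derivatives at some $a$ inside $D$. Then $f$ is identically zero.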

Suppose that $a_n$ is a sequence of points in $D$ such that $a_n \to a$ where $a$ is also a
point inside $D$. Suppose that $f(a_n) = 0$ for all $n$. Then $f(a) = \lim f(a_n) = 0$ and
$f'(a) = \lim (f(a) - f(a_n))/(a - a_n) = 0$. We claim that in fact all the derivatives
$f^{(k)}(a) = 0$ and hence $f \equiv 0$. Indeed, suppose that $k$ is the first positive integer with
$f^{(k)}(a) \neq 0$. Then by (20.20) $f(z) = (z - a)^k f_k(z)$ with $f_k(a) \neq 0$, and so, by
continuity, $f_k(z) \neq 0$ for $z$ close to $a$. Thus $f(z) \neq 0$ for $z$ close
enough to but not equal to $a$. This contradicts the hypothesis that $f(a_n) = 0$ with
$a_n \to a$. We express this by saying that the zeros of a (non-trivial) holomorphic
function must be isolated. If $f$ and $g$ are two holomorphic functions, we can apply
the preceding result to the holomorphic function $f - g$ to conclude that:

If $f$ and $g$ are holomorphic in a connected region $D$, and if $f(a_n) = g(a_n)$ at a
sequence of points $a_n$ having a limit $a = \lim a_n$ lying in $D$, then $f \equiv g$
throughout $D$.


This result shows how strikingly different the theory of holomorphic functions
is from the theory of $C^\infty$ functions of a real variable. The function $f$ defined by

$$f(x) = \begin{cases} e^{-1/x} & \text{for } x > 0, \\ 0 & \text{for } x \leq 0, \end{cases}$$

is a $C^\infty$ function on $\mathbb{R}$, which coincides with the identically zero function for all
negative $x$. Thus the behavior of a $C^\infty$ function in one region of its domain of
definition has no effect on its behavior on some other region at some finite
distance. This is not the case for holomorphic functions. Once we are given its
behavior on some small portion of its domain of definition, it is completely
determined throughout.

We have already seen that a holomorphic function, $g$, cannot vanish to infinite
order at any point interior to its domain of definition. So, if $g(a) = 0$ at some such
point $a$, then there is some smallest $k$ with $g^{(k)}(a) \neq 0$ and thus

$$g(z) = (z - a)^k g_k(z) \neq 0 \quad\text{for } z \text{ near } a,\ z \neq a,$$

and where $g_k$ is a holomorphic function. Suppose that $f$ is also holomorphic in
the same region and let us consider the quotient function $h = f/g$ which is defined
for $z$ near $a$, $z \neq a$. Let us use the expansion (20.20) of $f$ about $a$:

$$f(z) = f(a) + (z - a)f'(a) + \cdots + (z - a)^{k-1}\frac{f^{(k-1)}(a)}{(k - 1)!} + (z - a)^k f_k(z).$$

Then

$$h(z) = \frac{f(z)}{g(z)} = \frac{f(a)}{(z - a)^k g_k(z)} + \frac{f'(a)}{(z - a)^{k-1}g_k(z)} + \cdots + \frac{f^{(k-1)}(a)}{(z - a)(k - 1)!\,g_k(z)} + \frac{f_k(z)}{g_k(z)}.$$

Since $1/g_k$ is holomorphic near $a$, we can expand it about $a$ as well:

$$1/g_k(z) = b_0 + b_1(z - a) + \cdots + (z - a)^k b_k(z)$$

where

$$b_0 = 1/g_k(a) = k!/g^{(k)}(a), \text{ etc.}$$

If we substitute this expansion into the preceding expression for $h$, we see that

$$h(z) = a_{-k}(z - a)^{-k} + a_{-k+1}(z - a)^{-k+1} + \cdots + a_{-1}(z - a)^{-1} + h_0(z)$$

where $h_0$ is holomorphic near $a$,

$$a_{-k} = f(a)b_0 = f(a)/g_k(a),$$
$$a_{-k+1} = f(a)b_1 + f'(a)b_0,$$

etc. We have thus proved that:

The quotient of two holomorphic functions is meromorphic.

The preceding formulas simplify if $g$ has a simple zero at $a$, i.e., if $k = 1$. In that case,
we can assert that:

If $g(a) = 0$ and $g'(a) \neq 0$, then $f/g$ has a simple pole at $a$ with residue
$f(a)/g'(a)$, i.e., $f(z)/g(z) = a_{-1}(z - a)^{-1} + h_0(z)$ near $a$ with $a_{-1} = f(a)/g'(a)$
and $h_0(z)$ holomorphic near $a$. (20.25)

The number $k$ that we have been using in the preceding analysis is called the
order of the zero. Thus a holomorphic function $g$ has a zero of first order at $a$ if
$g(a) = 0$ and $g'(a) \neq 0$. It has a zero of order $k$ at $a$ if $g(a) = \cdots = g^{(k-1)}(a) = 0$, while
$g^{(k)}(a) \neq 0$.
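The residue formula (20.25) can be compared with a direct contour integral. This is only a sketch: the helper `residue_by_contour`, the radius, and the two sample quotients are assumptions chosen for illustration, and the residue is approximated as $\frac{1}{2\pi i}$ times a Riemann sum around a small circle that contains no other pole.

```python
import cmath


def residue_by_contour(h, a, radius=0.5, steps=4000):
    """(1/(2 pi i)) times the integral of h around a small circle about a."""
    total = 0j
    for k in range(steps):
        e = cmath.exp(2j * cmath.pi * k / steps)
        z = a + radius * e
        total += h(z) * 1j * radius * e * (2 * cmath.pi / steps)
    return total / (2j * cmath.pi)


# Example 1: f = exp, g = sin, simple zero of g at a = 0, so f(a)/g'(a) = 1.
res1 = residue_by_contour(lambda z: cmath.exp(z) / cmath.sin(z), 0j)

# Example 2: f = z^2 + 1, g = (z - 1)(z - 2), simple zero of g at a = 1.
# Here g'(1) = -1 and f(1) = 2, so f(a)/g'(a) = -2.
res2 = residue_by_contour(lambda z: (z * z + 1) / ((z - 1) * (z - 2)), 1 + 0j)

print(res1)  # close to 1
print(res2)  # close to -2
```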


Let the function $f$ be holomorphic in a bounded region $D$ and suppose that $f$
has a finite number of zeros in $D$. That is, we suppose that there are a finite number
of distinct points $\alpha_1, \ldots, \alpha_r$ at which $f$ vanishes and let $k_j$ be the order of the zero
of $f$ at $\alpha_j$. Then $f(z) = (z - \alpha_1)^{k_1}g(z)$, where $g$ vanishes only at $\alpha_2, \ldots, \alpha_r$ and has
the same order at each of these points as $f$. We can write $g(z) = (z - \alpha_2)^{k_2}h(z)$ where
$h$ vanishes only at $\alpha_3, \ldots, \alpha_r$. Proceeding in this way, we conclude that

$$f(z) = (z - \alpha_1)^{k_1}(z - \alpha_2)^{k_2} \cdots (z - \alpha_r)^{k_r}F(z)$$

where $F$ is holomorphic in $D$ and does not vanish anywhere in $D$. Differentiating
and dividing by $f$ shows that

$$\frac{f'(z)}{f(z)} = \frac{k_1}{z - \alpha_1} + \cdots + \frac{k_r}{z - \alpha_r} + \frac{F'(z)}{F(z)}$$

where $F'/F$ is holomorphic in $D$. In particular, if $g$ is some other function
holomorphic in $D$, then $gf'/f$ has only simple poles located at the $\alpha_j$ with residues
$k_j g(\alpha_j)$. Thus by Cauchy’s formula we get the following result:

Suppose that $f$ and $g$ are holomorphic in a bounded region $D$ and that $f$ has
only finitely many zeros in $D$ located at $\alpha_1, \ldots, \alpha_r$ with orders $k_1, \ldots, k_r$ and
$f \neq 0$ on $\partial D$. Then

$$\frac{1}{2\pi i}\int_{\partial D} g(\zeta)\frac{f'(\zeta)}{f(\zeta)}\,d\zeta = k_1 g(\alpha_1) + \cdots + k_r g(\alpha_r). \tag{20.26}$$

In particular, taking $g \equiv 1$, we get

$$\frac{1}{2\pi i}\int_{\partial D} \frac{f'(\zeta)}{f(\zeta)}\,d\zeta = k_1 + \cdots + k_r = \text{the number of zeros of } f \text{ counted with multiplicity.} \tag{20.27}$$

Notice that if $f$ is holomorphic on a larger region, $E$, such that $\bar D$ is contained in
$E$, then $f$ can only have finitely many zeros in $\bar D$, for otherwise $f$ would vanish at
a sequence of points converging to a limit in $E$, which would imply that $f$ is
identically zero. So we can readily guarantee that the hypotheses of (20.26) and
(20.27) are satisfied.
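Formulas (20.26) and (20.27) lend themselves to a numerical experiment. In the sketch below $D$ is assumed to be a disk, the derivative $f'$ is supplied by hand, and the helper name `weighted_zero_count` together with its parameters is hypothetical. The contour integral is again approximated by a Riemann sum over the boundary circle.

```python
import cmath


def weighted_zero_count(f, fprime, g=lambda z: 1.0, center=0j, radius=1.0, steps=4000):
    """(1/(2 pi i)) times the integral of g * f'/f over a circle, as in (20.26).

    With the default g = 1 this is the number of zeros of f inside the
    circle, counted with multiplicity, as in (20.27).
    """
    total = 0j
    for k in range(steps):
        e = cmath.exp(2j * cmath.pi * k / steps)
        z = center + radius * e
        dz = 1j * radius * e * (2 * cmath.pi / steps)
        total += g(z) * fprime(z) / f(z) * dz
    return total / (2j * cmath.pi)


# z^3 - 1 has three simple zeros inside the circle of radius 2.
n_cubic = weighted_zero_count(lambda z: z ** 3 - 1, lambda z: 3 * z ** 2, radius=2.0)

# (z - 0.5)^2 (z + 3) has a double zero at 0.5 inside the unit circle; -3 lies outside.
f = lambda z: (z - 0.5) ** 2 * (z + 3)
fp = lambda z: 2 * (z - 0.5) * (z + 3) + (z - 0.5) ** 2
n_double = weighted_zero_count(f, fp)
weighted = weighted_zero_count(f, fp, g=lambda z: z)  # k_1 * g(alpha_1) = 2 * 0.5

print(n_cubic, n_double, weighted)
```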







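Rouché’s theorem says that:

If $f$ and $h$ are holomorphic in a bounded region
$\bar D = D \cup \partial D$, and if $|f(z) - h(z)| < |f(z)|$ on $\partial D$, then $f$ and $h$ have the same
number of zeros (counted with multiplicity) in $D$. (20.28)

Proof. For $0 \leq t \leq 1$ the
function $h_t$ defined by

$$h_t = f + t(h - f)$$

is holomorphic in $\bar D$, does not vanish on $\partial D$, and has a finite number of zeros in
$D$. By (20.27) the number of these zeros is $\frac{1}{2\pi i}\int_{\partial D} (h'_t/h_t)\,d\zeta$, which depends
continuously on $t$. Being an integer, it must be constant. Since $h_0 = f$ and $h_1 = h$, $f$
and $h$ must have the same number of zeros, proving Rouché’s theorem.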

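Suppose that a function $f$ is holomorphic at all points $z \neq \beta$ but near $\beta$, and
that $|f(z)| \to \infty$ as $z \to \beta$. Then $f(z) \neq 0$ for $z$ near enough to $\beta$ and so $1/f$ is
holomorphic for $z$ near $\beta$, $z \neq \beta$, and tends to zero as $z \to \beta$. Thus $1/f$ has
a removable singularity at $\beta$ so that $1/f$ becomes holomorphic at $\beta$ if we assign
the value zero there. Since $1/f$ is now holomorphic near and including $\beta$, and is
not identically zero, it has a zero of some finite order $j$ at $\beta$, so that

$$1/f(z) = (z - \beta)^j g_j(z)$$

where $g_j$ is holomorphic near $\beta$ and $g_j(\beta) \neq 0$. But then

$$f(z) = (z - \beta)^{-j}h(z) \quad\text{where } h = 1/g_j \text{ is holomorphic near } \beta.$$

If we use the Taylor expansion of $h$ about $\beta$, we see that $f$ has a pole of order $j$
at $\beta$. We have proved that:

If $f$ is holomorphic for $z$ near $\beta$, $z \neq \beta$ and $|f(z)| \to \infty$ as $z \to \beta$, then $f$ has a
pole of finite order at $\beta$.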

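Suppose that $f$ is meromorphic in a bounded region $D$ and continuous (and
nowhere zero) on $\partial D$. Then $f$ can have only a finite number of zeros and poles in $D$.
Dealing with each zero or pole one at a time as above, we conclude that

$$f(z) = \frac{(z - \alpha_1)^{k_1} \cdots (z - \alpha_r)^{k_r}}{(z - \beta_1)^{j_1} \cdots (z - \beta_p)^{j_p}}\,F(z)$$

where $F$ is holomorphic with no zeros in $D$. (Here the $\alpha_i$ are the zeros with order
$k_i$ and the $\beta_i$ are the poles with order $j_i$.) We can then proceed as in the proof of
(20.25) and (20.26). We conclude:

Suppose that $f$ is meromorphic in a bounded region $D$, continuous in $\bar D$
near $\partial D$ and nowhere zero on $\partial D$. Suppose that $g$ is holomorphic in $D$, and
continuous on $\bar D$. Then

$$\frac{1}{2\pi i}\int_{\partial D} g(\zeta)\frac{f'(\zeta)}{f(\zeta)}\,d\zeta = \sum_{i=1}^{r} k_i\,g(\alpha_i) - \sum_{i=1}^{p} j_i\,g(\beta_i)$$

where $\alpha_1, \ldots, \alpha_r$ are the zeros of $f$ with orders $k_1, \ldots, k_r$ and $\beta_1, \ldots, \beta_p$ are the
poles of $f$ with orders $j_1, \ldots, j_p$. In particular, taking $g \equiv 1$, we get

$$\frac{1}{2\pi i}\int_{\partial D} \frac{f'(\zeta)}{f(\zeta)}\,d\zeta = \text{number of zeros of } f - \text{number of poles of } f \text{ (counted with multiplicities).}$$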

Suppose that $f$ is holomorphic for $z \neq \gamma$, $z$ near $\gamma$. Suppose that $\gamma$ is not a
removable singularity of $f$ and also that $\gamma$ is not a pole of $f$ (so that we do not
have $|f(z)| \to \infty$ as $z \to \gamma$). Then $\gamma$ is called an essential singularity of $f$. Some
idea of the complicated behavior of a function near an essential singularity is
expressed by the following result:

Suppose that $f$ is holomorphic near $\gamma$ except at $\gamma$ and has an essential
singularity at $\gamma$. Then given any complex number $c$ whatsoever, we can find
a sequence of points $a_i \to \gamma$ such that $f(a_i) \to c$. (20.29)

Indeed, suppose that this did not hold for some $c$. This means that we can find
some neighborhood, $U$, of $\gamma$ such that $f(z)$ stays a finite distance away from $c$ for
all $z$ in $U$, $z \neq \gamma$. Say $|f(z) - c| > 1/M$ for some $M$. Let $g$ be defined by
$g(z) = 1/(f(z) - c)$. Then $g$ is holomorphic in $U$ (except at $\gamma$) and $|g(z)| < M$. So $g$ has a
removable singularity at $\gamma$. But then $f = c + 1/g$ is meromorphic (with at worst
a pole) at $\gamma$, contradicting the assumption. An example of a function with an
essential singularity at 0 is the function $e^{1/z}$. Let us show explicitly that the
conclusion of (20.29) holds for this function. We must show that we can make $e^{1/z}$
as close as we please to any complex number, with $z$ arbitrarily close to the origin.
Let us set $w = 1/z$. We want to show that we can make $e^w$ as close as we like to
any complex number $c$ with $|w| > R$ for any $R$. Write $c = re^{i\theta}$ and $w = u + iv$. Then
we can arrange that $|w| > R$ by simply choosing $v > R$. Now $e^w = e^u \cdot e^{iv}$, so if $r > 0$,
we can choose $u = \log r$ and $v = \theta + 2\pi n$, where $2\pi n$ is large enough so that $v > R$.
So we have exactly solved the equation $e^w = c$, and infinitely often by choosing
larger and larger values of $n$, which amounts to the corresponding $z = 1/w$ getting
closer and closer to 0. The only value of $c$ that we cannot hit exactly is $c = 0$. But
we can choose a sequence of $c_k \to 0$, $c_k \neq 0$ and corresponding $w_k$ with $|w_k| \to \infty$
and $e^{w_k} = c_k$. Then $z_k = 1/w_k$ is the desired sequence of points approaching 0 with
$e^{1/z_k} \to 0$.

It is a somewhat deeper fact, which we will not prove here, that the above
behavior of $e^{1/z}$ is typical of a holomorphic function near an essential singularity.
That is, we can actually solve $f(z) = c$ exactly, infinitely often near $z = \gamma$ for all
values of $c$ with perhaps one possible exception. (For $e^{1/z}$ the exceptional value
was $c = 0$.)
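The explicit construction for $e^{1/z}$ translates directly into a short computation. This is a sketch only; the function name `preimages_near_zero`, the sample value of $c$, and the number of points are illustrative assumptions. It takes $u = \log r$ and $v = \theta + 2\pi n$ as in the argument above and returns the points $z_n = 1/(u + iv)$.

```python
import cmath
import math


def preimages_near_zero(c, count=5):
    """Points z_n tending to 0 with exp(1/z_n) = c, for a nonzero complex c."""
    r, theta = abs(c), cmath.phase(c)  # c = r * e^{i theta}
    points = []
    for n in range(1, count + 1):
        w = complex(math.log(r), theta + 2 * math.pi * n)  # e^w = c
        points.append(1 / w)
    return points


c = -3 + 4j
zs = preimages_near_zero(c, count=6)
for z in zs:
    print(abs(z), cmath.exp(1 / z))  # |z| shrinks while exp(1/z) stays at c
```

Increasing `count` produces solutions arbitrarily close to the origin, which is the sense in which the equation is solved infinitely often.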


20.6. The local mapping 

In this section we study the local properties of holomorphic functions a little more
closely. Our first result is an easy consequence of Rouché’s theorem:

Suppose that $f$ is holomorphic and not constant near $a$ and $f(a) = b$. Then
there is an $\varepsilon > 0$, such that for all $w$ satisfying $|w - b| < \varepsilon$ there is a $z$ near $a$
with $f(z) = w$. (20.30)

Proof. Set $g(z) = f(z) - b$. Then $g$ has a zero of some finite order at $a$. We can
find some small enough $r$ such that $g(z) \neq 0$ for $|z - a| \leq r$ except at $a$. Take $D$ to
be this disk of radius $r$ centered at $a$ and choose $\varepsilon > 0$ so that $|g(z)| > \varepsilon$ for $z \in \partial D$.
Suppose that $|w - b| < \varepsilon$ and define $g_w$ by

$$g_w(z) = f(z) - w = g(z) + (b - w).$$

By Rouché’s theorem, (20.28), $g_w$ has the same number of zeros (counted with
multiplicity) in $D$ as does $g$. Since $g$ has at least one, so does $g_w$, so we can find
at least one value of $z$ in $D$ with $f(z) = w$.

(20.30) is sometimes expressed by saying that a holomorphic function
defines an open map; that is, if a point is in the image, then a whole neighborhood
of the point is in the image.

(20.30) shows another striking difference between the theory of holomorphic 
functions of a complex variable and the theory of smooth functions of a real 
variable. Consider the map of $\mathbb{R} \to \mathbb{R}$ sending x into $u = x^2$. Then u takes on only
non-negative values. So 0 is in the image of this map but no neighborhood of 0
is in the image. We can easily jack this example up to a map of the plane into the
plane by sending (x, y) into (u, v) where $u = x^2$ and $v = y$. In this map the plane is
folded over along the y-axis so that no points (u, v) with u negative are in the image.

(20.30) says that this cannot happen for maps corresponding to holomorphic
functions. For instance, if we take $f(z) = z^2$, then for $w = re^{i\theta} \ne 0$, we can find two
values of z, namely $z = r^{1/2}e^{i\theta/2}$ and $z = r^{1/2}e^{i(\pi + \theta/2)}$, with $z^2 = w$. So instead of
finding no solutions of $z^2 = w$ for some range of w, we find two solutions.

This last property of the function $f(z) = z^2$ is also typical. Suppose that f is a
holomorphic function near a with $f(a) = b$. Let us go back to the proof of (20.30).
The function g defined there has a zero of finite order at a. Let k be the order of
the zero. If $k > 1$, then $g'(a) = 0$. We can choose the r (possibly smaller than in the
proof of (20.30)) so that not only g, but also $g'$, has no other zeros in D. Thus in
all cases, $k = 1$ or $k > 1$, g has no zeros in D except at a and $g'$ has no zeros in D
except possibly at a. Notice that $g_w$, defined in the proof of (20.30), differs from g
by a constant. Thus $g_w' = g'$ and so $g_w'(z) \ne 0$ in D for $z \ne a$. But, for $w \ne b$, the
zeros of $g_w$ are not located at a. This means that each of the zeros of $g_w$ must be
simple, and hence there must be k distinct zeros. We have thus proved:

Let f be holomorphic near a with $f(a) = b$ and suppose that the function g
defined by $g(z) = f(z) - b$ has a zero of order k at a. Then we can find an
$r > 0$ and an $\varepsilon > 0$ such that for any $w \ne b$ with $|w - b| < \varepsilon$ there are exactly k
distinct z values, $z_1, \ldots, z_k$, with $|z_i - a| < r$ and $f(z_i) = w$. (20.31)

(20.31) shows the true meaning of a zero of order k. It is where k distinct roots 
coalesce. 
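To see the coalescing of roots concretely, here is a small sketch for the model case $f(z) = z^k$ with $a = b = 0$. The function name `kth_roots` and the sample values of w are assumptions made for this illustration only. For each small $w \ne 0$ it lists the k distinct solutions of $z^k = w$, all of which lie on a small circle about the origin.

```python
import cmath
import math

def kth_roots(w, k):
    """All k solutions of z**k = w for w != 0."""
    r, theta = abs(w), cmath.phase(w)
    return [r ** (1 / k) * cmath.exp(1j * (theta + 2 * math.pi * j) / k)
            for j in range(k)]

for w in (1e-3, 1e-6j, -1e-9):
    roots = kth_roots(w, 3)
    # three distinct roots, each of modulus |w|**(1/3), shrinking as w -> 0
    print(w, [abs(z) for z in roots])
```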

The case $k = 1$ of (20.31) is sufficiently important for us to record it separately:

Let f be holomorphic near a with $f(a) = b$ and $f'(a) \ne 0$. Then we can find an
$r > 0$ and an $\varepsilon > 0$ so that for each w with $|w - b| < \varepsilon$ there is a unique z with
$|z - a| < r$ satisfying $f(z) = w$. In other words, there is a unique function g
defined for $|w - b| < \varepsilon$ satisfying $|g(w) - a| < r$ and $f(g(w)) = w$. (20.32)

We claim that:

The function g defined in (20.32) is holomorphic and $g'(w) = 1/f'(g(w))$. (20.33)

We have chosen the r in (20.32) so that $f'(z) \ne 0$ for $|z - a| < r$. So to prove
(20.33), it is enough for us to prove that g is differentiable in the complex sense
at the point b and $g'(b) = 1/f'(a)$. The same argument would then apply to any
$z = g(w)$ and the formula for $g'$ would show that it is continuous. Suppose that
$f'(a) = c \ne 0$. Let $\eta < \tfrac12|c|$. By the definition of the derivative of f, we can find a
$\delta > 0$ such that

$$\left|\frac{f(z) - f(a)}{z - a} - c\right| < \eta \quad\text{for}\quad |z - a| < \delta. \qquad (20.34)$$

Taking $\delta$ as a new r (if necessary) in (20.32) shows that, for all w close enough to
b, we have

$$\left|\frac{w - b}{z - a} - c\right| < \eta < \frac{|c|}{2}, \quad\text{so}\quad |w - b| > \tfrac12|c(z - a)| \quad\text{where } z = g(w),$$

or

$$\left|\frac{z - a}{(w - b)c}\right| < \frac{2}{|c|^2}.$$

If we now multiply (20.34) by $(z - a)/c(w - b)$, we get

$$\left|\frac{1}{c} - \frac{g(w) - a}{w - b}\right| < 2|c|^{-2}\eta.$$

We can make $\eta$ as small as we like by choosing $\delta$ and hence $\varepsilon$ small enough. This
shows that g is differentiable in the complex sense at b with $g'(b) = 1/f'(a)$,
proving (20.33).

The two assertions, (20.32) and (20.33), constitute the implicit
function theorem for holomorphic functions. We must emphasize that the
implicit function theorem is a local theorem. Let us look again at the function
$f(z) = z^2$. We know that for any $w \ne 0$ there are two roots of $f(z) = w$. Suppose
that we specify the square root of a positive number by demanding that it be
positive; for example, take $a = 1$ with $f(a) = 1$. Then (20.32) implies that for w near
to one, we will have specified a unique square root by demanding that it be close
to one, and (20.33) implies this function $g(w) = w^{1/2}$ will be holomorphic near
$w = 1$. Indeed, it is not hard to see that we can take the $\varepsilon$ in (20.33) to be any
number less than 1. Of course, we can take any point w in the disc of radius $\varepsilon$
about $w = 1$ as a new choice of b, with its $g(w)$ as a new choice of a and apply the
implicit function theorem once again. Suppose we make such a succession of
choices with $|w| = 1$ as indicated in figure 20.10.

As we come back full circle to the positive w-axis, we end up with the opposite
choice of the square root, which is not surprising. Thus, although (20.32) guarantees
the existence of a local inverse for a function f, a succession of applications of
(20.32) may lead to a global inconsistency. In the case of $f(z) = z^2$ there simply is
no globally well-defined function $w^{1/2}$ on the w-plane.
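The sign flip can be reproduced numerically. In the sketch below (our own illustration; the name `continue_sqrt_around_circle` and the step count are assumptions) we walk once around the unit circle and at each step keep whichever of the two roots of $z^2 = w$ lies closest to the previous choice, mimicking the succession of local inverses.

```python
import cmath
import math

def continue_sqrt_around_circle(steps=360):
    """Follow the local square root of w = e^{it} from t = 0 to t = 2*pi.

    At each step we pick, of the two roots of z**2 = w, the one closest to
    the previously chosen value (the local inverse near that point).
    """
    z = 1.0 + 0.0j                      # start with the positive root of w = 1
    for j in range(1, steps + 1):
        w = cmath.exp(1j * 2 * math.pi * j / steps)
        root = cmath.sqrt(w)            # one of the two roots; the other is -root
        z = root if abs(root - z) <= abs(-root - z) else -root
    return z

# After a full circuit we are back at w = 1 but with the opposite root.
print(continue_sqrt_around_circle())
```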





Figure 20.10 (the w-plane)


There are two ways that are used to deal with this problem of ambiguity, and
we shall illustrate each of them for the special case of the square-root function.
One is simply to live with the ambiguity. We think of the z-plane as being a
two-sheeted covering of the w-plane with a ramification point at the origin where
only one z-value corresponds to the w-value. Whenever we are given a functional
expression such as $\cos(w^{3/2} + 1)$, we understand that this is not really a function
defined on the w-plane but is a function of z; i.e., $\cos(z^3 + 1)$. The z-plane is the
Riemann surface associated to all functions of $w^{1/2}$ in the sense that they are all
defined as functions of z. This was the point of view espoused by Riemann. The
detailed study of the structure of such Riemann surfaces for various other kinds
of holomorphic functions had a profound effect on the development of geometry,
topology and algebra well up to the present time. We will not go into these matters
here.

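An alternative, more mundane, approach to the problem of defining $w^{1/2}$ is to
choose a curve in the w-plane along which we agree
that $w^{1/2}$ is not to be defined, so as to make its specification unique everywhere
else. For example, suppose we agree to cut the w-plane along the negative w-axis.
Writing $w = re^{i\theta}$ with $-\pi < \theta < \pi$, the choice $w^{1/2} = r^{1/2}e^{i\theta/2}$ is then uniquely
specified away from the cut; it is
known as the principal branch of the square root.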
integrals which will be of extreme importance to us later on. Consider the integral

$$\int_{-\infty}^{\infty} e^{-\lambda x^2/2}\,dx.$$

When $\operatorname{Re}\lambda > 0$, the function $e^{-\lambda x^2/2}$ vanishes rapidly at $\pm\infty$ and the above integral
converges absolutely. When $\operatorname{Re}\lambda < 0$ the function $e^{-\lambda x^2/2}$ blows up at infinity, so
the integral makes no sense. When $\operatorname{Re}\lambda = 0$, the function $e^{-\lambda x^2/2}$ has absolute value
one for all x, so the integral certainly does not converge absolutely. Nevertheless
we claim that it does converge. In fact, we claim:

The integral $\int_{-\infty}^{\infty} e^{-\lambda x^2/2}\,dx$ converges uniformly when $\lambda$ satisfies $\operatorname{Re}\lambda \ge 0$,
$|\lambda| \ge 1$ and its value is therefore a continuous function of $\lambda$ on this range.

(We could take $|\lambda| \ge c$ for any fixed $c > 0$ and have uniformity of convergence.) Since
$e^{-\lambda x^2/2}$ is a continuous function of $\lambda$ and x, the uniform convergence of the integral
implies that its value is a continuous function of $\lambda$.


the logarithm function, log w, is not well defined; it is only defined up to adding
an integer multiple of $2\pi i$. Again there are two ways of dealing with this
ambiguity. One is to think of the z-plane as being an infinitely sheeted covering of
the w-plane, and any functional expression involving log w is to be thought of as
a function of z. The alternative is to cut the w-plane, say along the negative real
axis and then, writing

$$w = re^{i\theta}, \qquad -\pi < \theta < \pi,$$

define the principal branch of the logarithm function by

$$\log w = \log r + i\theta.$$


By (20.32) and (20.33) we know that log w is a holomorphic function whose derivative
is $1/w$. For any complex number c we can then define $w^c = e^{c\log w}$.

As mentioned above, there are two possible interpretations of this definition.
If we use the principal branch of the logarithm,
then $w^c$ becomes defined at all points except for w lying on the non-positive real axis.


20.7. Contour integrals



definite integrals. We have indicated a typical application in section 20.4. We have 







some of which will be of use to us later on. In all cases, a certain amount of
ingenuity is required in the choice of contour. One is given a definite integral to
evaluate. This usually translates into an integral over a curve in the complex plane.
One then adds other curves so as to get the boundary of some region and applies
Cauchy’s residue theorem. One must choose the other pieces so that their contribution
to the integral either is known, or is some multiple of the desired integral,
or becomes vanishingly small when an appropriate limit is taken.

(a) Let $R(x) = P(x)/Q(x)$ be a rational function, so P and Q are polynomials.
Suppose that Q has no real zeros. If the degree of Q is at least two more than the
degree of P, the integral $\int_{-\infty}^{\infty} R(x)\,dx$ converges and is the limit of $\int_{-r}^{r} R(x)\,dx$ as
$r \to \infty$. We think of this as the integral of the complex function R(z) along the
line segment from $-r$ to r on the real axis. Adjoin the semicircle of radius r
centered at the origin in the upper half-plane.

We now have a closed path which is the boundary of the half-disk in the upper
half-plane, and we can apply the Cauchy residue theorem. On the other hand,
$|R(z)| \le Kr^{-2}$ for z on the semicircle and K some suitable constant. Since the
semicircle has length $\pi r$, this means that the integral over the semicircle tends to
zero as $r \to \infty$. Thus

$$\int_{-\infty}^{\infty} R(x)\,dx = 2\pi i\sum_{\operatorname{Im} z > 0}\operatorname{res} R(z),$$

the sum being taken over all poles in the upper half-plane.
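As a small worked instance of this formula (our own illustration, not an example from the text), take $R(x) = 1/(1 + x^2)$. The denominator has degree two more than the numerator and no real zeros, and the only pole in the upper half-plane is the simple pole at $z = i$:

```latex
\int_{-\infty}^{\infty} \frac{dx}{1+x^{2}}
  = 2\pi i \,\operatorname{res}_{z=i} \frac{1}{(z-i)(z+i)}
  = 2\pi i \cdot \frac{1}{2i}
  = \pi .
```

This agrees with the elementary evaluation by means of the arctangent.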

(b) Consider the integral $\int_{-\infty}^{\infty} e^{ix}R(x)\,dx$ where $R = P/Q$ is a rational function
with no poles on the real axis. If $\deg Q \ge \deg P + 2$, we can proceed exactly as
before. If $\deg Q \ge \deg P + 1$, so all we know is that R(x) vanishes like 1/x at infinity,
we still can prove that the integral is convergent. In fact, we can use an integration
by parts argument as in section 20.6. Since $e^{ix} = -i(d/dx)e^{ix}$, we have

$$\int_r^s e^{ix}R(x)\,dx = -ie^{is}R(s) + ie^{ir}R(r) + i\int_r^s e^{ix}R'(x)\,dx,$$

and $R'$ vanishes to order $1/x^2$ at infinity. To
evaluate the integral, we again take the limit of the integral from $-r$ to r and
evaluate this integral by adjoining the integral over a semicircle in the upper
half-plane. Here we must be slightly more careful in estimating the integral over
the semicircle. If $z = re^{i\theta} = r(\cos\theta + i\sin\theta)$, then

$$e^{iz} = e^{ir\cos\theta}e^{-r\sin\theta}$$



and so the integral over the semicircle can be estimated as follows:

$$\left|\int_{\text{semicircle}} e^{iz}R(z)\,dz\right| = \left|\int_0^{\pi} e^{ir\cos\theta}e^{-r\sin\theta}R(re^{i\theta})\,ire^{i\theta}\,d\theta\right| \le \int_0^{\pi} e^{-r\sin\theta}\,|rR(re^{i\theta})|\,d\theta.$$

Now $|rR(re^{i\theta})|$ is bounded by some constant, say K. So our problem becomes one
of estimating the integral $\int_0^{\pi} e^{-r\sin\theta}\,d\theta = 2\int_0^{\pi/2} e^{-r\sin\theta}\,d\theta$. Now $\sin\theta \ge 2\theta/\pi$ for
$0 \le \theta \le \pi/2$. (Indeed, both sides are equal at the end points and the difference has one
maximum in the interior.) So

$$\int_0^{\pi/2} e^{-r\sin\theta}\,d\theta \le \int_0^{\pi/2} e^{-2r\theta/\pi}\,d\theta = (\pi/2r)(1 - e^{-r}) \to 0.$$

Thus the integral over the semicircle goes to zero and we get

$$\int_{-\infty}^{\infty} e^{ix}R(x)\,dx = 2\pi i\sum_{\operatorname{Im} z > 0}\operatorname{res}\left(e^{iz}R(z)\right),$$

the sum being taken over all poles in the upper half-plane. If Q has a zero on the
real axis, the above integral makes no sense as it stands. Still, it might make sense
in some case as a special kind of limit known as the Cauchy principal value, as is
illustrated in the following example. Suppose that R has a simple pole at $z = 0$
with residue A, so that $R(z) = A/z + R_0(z)$ with $R_0$ holomorphic near the origin.
Let us consider the same line integral as before, except that we make a detour
along a small semicircle of radius $\varepsilon$ in the lower half-plane to avoid the origin.



The enclosed region now contains all the poles in the upper half-plane together
with the pole at the origin. The integral of A/z around the little semicircle gives
$\pi iA$, canceling out half the residue from the origin. Thus

$$P\left(\int_{-\infty}^{\infty} e^{ix}R(x)\,dx\right) = \lim_{\varepsilon \to 0}\left[\int_{-\infty}^{-\varepsilon} + \int_{\varepsilon}^{\infty}\right] e^{ix}R(x)\,dx = \pi iA + 2\pi i\sum_{\operatorname{Im} z > 0}\operatorname{res}\left(e^{iz}R(z)\right),$$

where P( ) means the Cauchy principal value. For example, taking $R(x) = 1/x$ gives

$$P\left(\int_{-\infty}^{\infty}\frac{e^{ix}}{x}\,dx\right) = \pi i.$$

Taking the real and imaginary parts of this equation gives $P\left(\int_{-\infty}^{\infty}(\cos x/x)\,dx\right) = 0$,
which is obvious since cos x/x is an odd function, and $P\left(\int_{-\infty}^{\infty}(\sin x/x)\,dx\right) = \pi$.
Now sin x/x has no singularity at the origin, so we do not have to worry about
taking the principal value for this last integral and sin x/x is an even function. So
we get

$$\int_0^{\infty}\frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$



(c) Suppose we consider integrals of the form $\int_0^{\infty} x^aR(x)\,dx$, where a is some
real number with $0 < a < 1$, and where R is a rational function vanishing to second
order at infinity and with at worst a pole of first order at the origin. Here the
trick is to first make the change of variables $x = t^2$, so the integral now becomes
$2\int_0^{\infty} t^{2a+1}R(t^2)\,dt$. For evaluating this integral, it is convenient to choose the branch
of $z^{2a}$ which is obtained by cutting the z-plane along the negative imaginary axis
so that we write $z = re^{i\theta}$ with $-\pi/2 < \theta < 3\pi/2$, so that $z^{2a} = r^{2a}e^{i2a\theta}$ with
$-\pi a < 2a\theta < 3\pi a$. We must now choose our contour so that it does not come in
contact with the negative imaginary axis. Choose a contour to consist of two line
segments along the real axis, a small semicircle about the origin and a large
semicircle in the upper half-plane (Figure 20.14). Our assumptions clearly imply that the integrals
over the semicircles tend to zero as the large semicircle expands out and the small
one shrinks to the origin. On the negative real axis we have

$$(-t)^{2a} = e^{2\pi i a}t^{2a}$$

and so the integral over the whole contour tends to

$$(1 - e^{2\pi i a})\int_0^{\infty} t^{2a+1}R(t^2)\,dt.$$

Since $(1 - e^{2\pi i a}) \ne 0$, we can divide by it to give an evaluation of our original
integral.

An alternative is to work directly with the variable x
and use the branch of $z^a$ determined by writing $z = re^{i\theta}$ with $0 < \theta < 2\pi$. Notice that
the cut now lies along the positive real axis, so we cannot regard
our original integral as a precise component of a closed contour. Instead, integrate
along a line just above the positive real axis, in the upper
half-plane, with the idea of then passing to the limit to get our original integral.
We consider a closed contour consisting of two line segments, one just above
and one just below the positive real axis, together with a small circle about the origin and
a large circle. The circle contributions vanish in the limit. The two line segment
contributions do not cancel each other out precisely because the determinations of
$z^a$ just above and just below the positive real axis differ by a factor of $e^{2\pi i a}$. So it
is precisely the fact that $z^a$ is not well defined that allows us to evaluate the integral.

(d) We can use contour integration to sum various infinite series. The idea is
to apply Cauchy’s residue theorem to larger and larger domains where the boundary
integrals tend to zero. We are then left with an (infinite) sum of residues which
must vanish. Bringing one summand to the other side of the equation gives us a
sum of an infinite series. We illustrate this with some sums that are useful in the
study of trigonometric functions. For this purpose we first define the trigonometric
functions of a complex variable. We set

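$$\cos z = \tfrac12\left(e^{iz} + e^{-iz}\right) \quad\text{and}\quad \sin z = \frac{1}{2i}\left(e^{iz} - e^{-iz}\right).$$

These functions are clearly holomorphic in the entire complex plane and coincide
with the usual trigonometric functions when z is real. Since $\sin z = 0$ only when
$e^{2iz} = 1$, the only complex zeros of sin z are at the real points $n\pi$. Similarly, the only
complex zeros of cos z are at the real points $(n + \tfrac12)\pi$. We then define

$$\tan z = \frac{\sin z}{\cos z} \quad\text{and}\quad \cot z = \frac{\cos z}{\sin z}.$$

These functions have poles at the zeros of their denominators, and we
can use (20.25) to compute the residues at these poles since all the zeros are simple.
In particular, cot z has residue 1 at each of the points $n\pi$.

Consider the circles of
radius $(n + \tfrac12)\pi$ centered at the origin. Now consider the integral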


$$\int \frac{\cot\zeta}{\zeta^2 - z^2}\,d\zeta$$

over these circles, where z is some fixed complex number $\ne n\pi$. As we let the circles
expand out to infinity the integrals go to zero since $|\cot\zeta|$ is bounded, and we have
a polynomial in $\zeta$ of degree two in the denominator. The poles of the integrand
are located at $\pm z$ and at $n\pi$. The residues at $\pm z$ are each equal to $\tfrac12\cot z/z$ and
the residue at $n\pi$ is $(n^2\pi^2 - z^2)^{-1}$. So

$$\frac{\cot z}{z} + \sum_{n=-\infty}^{\infty}(n^2\pi^2 - z^2)^{-1} = 0.$$

Bringing the first term over to the other side, multiplying by z and separating off
the summand corresponding to $n = 0$, give the formula

$$\cot z = \frac{1}{z} + 2z\sum_{n=1}^{\infty}\frac{1}{z^2 - n^2\pi^2}. \qquad (20.36)$$
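Formula (20.36) is easy to test numerically. The sketch below is our own check, with the function name `cot_partial_fractions`, the truncation length and the sample point chosen arbitrarily. It compares a long partial sum of the right-hand side with cos z / sin z at a complex point away from the poles.

```python
import cmath
import math

def cot_partial_fractions(z, terms=100_000):
    """Partial sum 1/z + 2z * sum_{n=1}^{terms} 1/(z^2 - n^2 pi^2)."""
    total = 1 / z
    for n in range(1, terms + 1):
        total += 2 * z / (z * z - (n * math.pi) ** 2)
    return total

z = 0.7 + 0.3j                           # any point that is not a multiple of pi
exact = cmath.cos(z) / cmath.sin(z)
# The neglected tail is of size roughly 2|z| / (pi^2 * terms).
print(abs(cot_partial_fractions(z) - exact))
```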


The series on the right converges, and converges uniformly in z so long as z stays
a fixed distance away from all the poles. If $f(z) = \sin z$, then $f'(z)/f(z) = \cot z$. We
can also write $f'(z)/f(z) = (d/dz)(\log f(z))$ and so write (20.36) as

$$\frac{d}{dz}\log\left(\frac{\sin z}{z}\right) = \sum_{n=1}^{\infty}\frac{d}{dz}\log(z^2 - n^2\pi^2).$$

The function sin z/z does not vanish at $z = 0$. Let us integrate the above equation
along any path joining 0 to z and staying a finite distance away from all the points
$n\pi$. We get the equation

$$\log(\sin z/z) - \log 1 = \sum_{1}^{\infty}\left(\log(z^2 - n^2\pi^2) - \log(-n^2\pi^2)\right).$$

This equation is somewhat ambiguous as it stands, since the values of the logarithm,
defined as log g = the integral of $g'/g$, depend on the path chosen. But exponentiating
both sides of the equation eliminates all these ambiguities and gives the famous
formula

$$\sin z = z\prod_{n=1}^{\infty}\left(1 - \frac{z^2}{n^2\pi^2}\right)$$

representing sin z as an infinite product.
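The product formula can likewise be checked on a computer. This is a rough sketch under our own choices (the name `sine_product`, the number of factors and the test point are illustrative assumptions); the truncated product converges slowly, with relative error of order $|z|^2/(\pi^2 N)$ after N factors.

```python
import cmath
import math

def sine_product(z, factors=100_000):
    """Truncated product z * prod_{n=1}^{factors} (1 - z^2 / (n^2 pi^2))."""
    product = z
    for n in range(1, factors + 1):
        product *= 1 - (z / (n * math.pi)) ** 2
    return product

z = 1.2 + 0.5j
print(abs(sine_product(z) - cmath.sin(z)))
```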
We begin with a straightforward application of Cauchy’s integral formula.

Let $f_n$ be a sequence of functions all holomorphic in a domain D and
suppose that $f_n(z) \to f(z)$ uniformly in D. Then f is holomorphic in D and the
$f_n'(z)$ converge uniformly to $f'(z)$ on any subset of D which is a finite
distance from $\partial D$. (20.37)





Proof. For any $z \in D$ choose a circle C lying with its interior inside D with z inside C.
Then

$$f_n(z) = \frac{1}{2\pi i}\int_C \frac{f_n(\zeta)}{\zeta - z}\,d\zeta.$$

Passing to the limit gives

$$f(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta$$

and we know from the discussion immediately following (20.18) that this implies
that f is holomorphic. By formula (20.19) we have

$$f_n'(z) = \frac{1}{2\pi i}\int_C \frac{f_n(\zeta)}{(\zeta - z)^2}\,d\zeta$$

which clearly converges to

$$f'(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{(\zeta - z)^2}\,d\zeta$$

and the uniformity of the convergence is clear.

We may frequently want to apply
(20.37) to the case where D is a bounded subregion of some larger domain on
which the $f_n$ are defined and converge. That is, the $f_n$ are defined and holomorphic
on some large region E and converge uniformly on each of a sequence of subregions
$D_k$ with $\bigcup D_k = E$. By applying (20.37) to each $D_k$, we conclude that the limit
function f is holomorphic in all of E.

Applying (20.37) to the partial sums of a series gives

If a series with holomorphic terms $f(z) = f_1(z) + f_2(z) + \cdots$ converges
uniformly in D, then the sum f is holomorphic and the series can be
differentiated term by term. (20.38)

The most important series associated with a holomorphic function is its Taylor
series. Suppose that f is holomorphic in a domain D and the disk of radius R
centered at a point a lies entirely inside D. Let z be any point interior to this disk
so that $|z - a| < R$. Then we can use (20.22) to estimate the last term in (20.20) as
having absolute value at most

$$\frac{M|z - a|}{R - |z - a|}\,\frac{|z - a|^{n-1}}{R^{n-1}},$$

which tends uniformly to zero on any slightly smaller disk. Thus by (20.38) we get

If f is holomorphic in a domain D and a is any point of D, then the series

$$f(a) + f'(a)(z - a) + \frac{f''(a)}{2!}(z - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(z - a)^n + \cdots$$

converges to f(z) in the largest open disk centered at a lying entirely in D. (20.39)

The series in (20.39) is known as the Taylor series of f at a. Many operations on
holomorphic functions correspond to simple operations on their Taylor series. For
ease in exposition let us take $a = 0$. Then if

$$f(z) = a_0 + a_1z + a_2z^2 + \cdots$$

and

$$g(z) = b_0 + b_1z + b_2z^2 + \cdots$$

then

$$f'(z) = a_1 + 2a_2z + 3a_3z^2 + \cdots,$$

$$f(z) + g(z) = (a_0 + b_0) + (a_1 + b_1)z + (a_2 + b_2)z^2 + \cdots,$$

$$f(z)g(z) = a_0b_0 + (a_1b_0 + a_0b_1)z + (a_2b_0 + a_1b_1 + a_0b_2)z^2 + \cdots.$$

If $a_0 = b_0 = 0$, then

$$f(g(z)) = a_1(b_1z + b_2z^2 + \cdots) + a_2(b_1z + b_2z^2 + \cdots)^2 + \cdots$$

$$= a_1b_1z + (a_1b_2 + a_2b_1^2)z^2 + (a_1b_3 + 2a_2b_1b_2 + a_3b_1^3)z^3 + \cdots.$$

From this last equation, we can recursively determine the power series of the inverse
of a function, f, if we know the power series of f. If we know the a’s and we want to
find the b’s such that $f(g(z)) = z$, we get

$$a_1b_1 = 1 \quad\text{so}\quad b_1 = a_1^{-1}$$

(we must assume $f'(0) \ne 0$ to be able to invert the function),

$$a_1b_2 + a_2b_1^2 = 0 \quad\text{so}\quad b_2 = -a_1^{-1}(a_2b_1^2),$$

$$a_1b_3 + 2a_2b_1b_2 + a_3b_1^3 = 0 \quad\text{so}\quad b_3 = -a_1^{-1}(2a_2b_1b_2 + a_3b_1^3),$$

and so on. In general, the coefficient of $z^n$ will be an expression of the form
$a_1b_n$ + (terms involving b’s of order less than n) and so we can solve recursively for
each $b_n$.
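The first three steps of this recursion are simple enough to run by machine. The following sketch uses exact rational arithmetic; the function name `inverse_series_first_three` and the test function $f(z) = e^z - 1$ are our own choices for illustration. Since the inverse of $e^z - 1$ is $\log(1 + z) = z - z^2/2 + z^3/3 - \cdots$, the output can be compared with a known series.

```python
from fractions import Fraction

def inverse_series_first_three(a1, a2, a3):
    """Coefficients b1, b2, b3 of g with f(g(z)) = z, where
    f(z) = a1 z + a2 z^2 + a3 z^3 + ... and a1 != 0."""
    if a1 == 0:
        raise ValueError("need f'(0) = a1 != 0 to invert")
    b1 = 1 / a1
    b2 = -(a2 * b1 ** 2) / a1
    b3 = -(2 * a2 * b1 * b2 + a3 * b1 ** 3) / a1
    return b1, b2, b3

# f(z) = e^z - 1 = z + z^2/2 + z^3/6 + ...
print(inverse_series_first_three(Fraction(1), Fraction(1, 2), Fraction(1, 6)))
```

The same pattern extends to higher n, at the cost of expanding more powers of the series for g.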

It is an elementary fact that any power series has a radius of convergence, R
(possibly zero), such that the series converges for $|z| < R$, and for no z with $|z| > R$.

It follows from (20.38) that the series represents a holomorphic function inside its 
circle of convergence. It follows from (20.39) that, conversely, the Taylor series of 
a holomorphic function has, as its radius of convergence, the radius of the largest 
disk contained in the true domain of definition of /. 

A series of the form $a_0 + a_{-1}z^{-1} + a_{-2}z^{-2} + \cdots$ can be thought of as a power
series in the variable 1/z. So it will converge outside some circle of radius R
(possibly $R = \infty$). A series of the form $\sum_{-\infty}^{\infty} a_nz^n$ is said to converge if and only
if its positive and negative parts converge. The positive part will converge for
$|z| < R_2$ for some $R_2$ and the negative part will converge for $|z| > R_1$ for some $R_1$.
So there will be some non-empty region of convergence for the double series if and
only if $R_1 < R_2$, in which case, by (20.38), it represents a holomorphic function on
the annulus $R_1 < |z| < R_2$. Conversely, suppose that f is holomorphic in such an
annulus. We shall show that f has such a double series expansion. In fact, to prove
this, all we have to do is show that f can be written as the sum $f = f_1 + f_2$, where
$f_1$ is holomorphic for $|z| < R_2$ and $f_2$ is holomorphic for $|z| > R_1$ and is such that
$f_2(1/z)$ has a removable singularity at 0, so that $f_2(1/z)$ is holomorphic for












that $f_1$ is holomorphic for $|z| < R_2$ and $f_2$ is holomorphic for $|z| > R_1$. By Cauchy’s
integral formula, $f_1(z) + f_2(z) = f(z)$. If we set $z' = 1/z$ and substitute $\zeta' = 1/\zeta$ in the
integral defining $f_2$, we get an integral over the circle $|\zeta'| = 1/r$ with $\zeta' - z'$ in the denominator,
which shows that $f_2(1/z')$ is holomorphic in $z'$ for $|z'| < 1/R_1$. We can expand $f_1$
in a power series

$$f_1(z) = \sum_{0}^{\infty} a_nz^n \quad\text{with}\quad a_n = \frac{1}{2\pi i}\int_{|\zeta| = r}\frac{f(\zeta)}{\zeta^{n+1}}\,d\zeta$$

and we can expand $f_2(1/z') = \sum_{1}^{\infty} b_nz'^n$ with

$$b_n = \frac{1}{2\pi i}\int_{|\zeta'| = 1/r}\frac{f(1/\zeta')}{\zeta'^{\,n+1}}\,d\zeta' = \frac{1}{2\pi i}\int_{|\zeta| = r} f(\zeta)\zeta^{n-1}\,d\zeta.$$

Setting $a_{-n} = b_n$, we get the desired
expansion, known as the Laurent expansion

$$f(z) = \sum_{-\infty}^{\infty} a_nz^n \quad\text{where}\quad a_n = \frac{1}{2\pi i}\int_{|\zeta| = r}\frac{f(\zeta)}{\zeta^{n+1}}\,d\zeta. \qquad (20.40)$$


It will be very important for us later on to write out this formula explicitly in the
case where $R_1 < 1 < R_2$. Thus we assume that f is defined and holomorphic in
some neighborhood of the unit circle, $|z| = 1$. Let us write $F(\theta) = f(e^{i\theta})$ and take
$r = 1$ in (20.40). Then we get

$$f(z) = \sum a_nz^n \quad\text{where}\quad a_n = \frac{1}{2\pi}\int_0^{2\pi} F(\theta)e^{-in\theta}\,d\theta. \qquad (20.41)$$

In particular, taking $z = e^{i\theta}$ to be on the unit circle and substituting into the Laurent
expansion of f, we get

$$F(\theta) = \sum_{-\infty}^{\infty} a_ne^{in\theta} \qquad (20.42)$$

with the $a_n$ given by (20.41). The series (20.42) is known as the Fourier series for
F. If f is holomorphic on some region $R < |z| < 1/R$ where R is some number,
$R < 1$, then, since the $a_n$ are coefficients of convergent power series, it follows
from the convergence criterion for power series that

$$|a_n| \le c\,r^{|n|} \quad\text{for any } r > R$$

for some suitable constant $c = c_r$.
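Formula (20.41) lends itself to direct numerical evaluation. As a hedged illustration (the helper `laurent_coefficient`, the sample count and the test function $f(z) = 1/(2 - z)$ are our own assumptions, not part of the text), the sketch below approximates the integral by an equally spaced sum on the unit circle. For this f, which is holomorphic for $|z| < 2$, the exact coefficients are $a_n = 2^{-(n+1)}$ for $n \ge 0$ and $a_n = 0$ for $n < 0$, so the geometric decay of the coefficients is visible directly.

```python
import cmath
import math

def laurent_coefficient(f, n, samples=256):
    """Approximate a_n = (1/2pi) * integral_0^{2pi} f(e^{it}) e^{-int} dt
    by an equally spaced Riemann sum on the unit circle."""
    total = 0j
    for j in range(samples):
        t = 2 * math.pi * j / samples
        total += f(cmath.exp(1j * t)) * cmath.exp(-1j * n * t)
    return total / samples

f = lambda z: 1 / (2 - z)       # sample function, holomorphic for |z| < 2
for n in (-2, -1, 0, 1, 2, 5):
    print(n, laurent_coefficient(f, n))
```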





A Holomorphic functions

You should be able to define a holomorphic function in terms of two-forms and
show that its Jacobian matrix represents a conformal transformation.

You should be able to derive the Cauchy integral theorem and residue formula
from Stokes' Theorem.

You should be able to calculate the residue of a function at a simple or multiple
pole.

B Contour integration

You should be able to evaluate integrals by applying the residue theorem to a
specified contour, and in a few standard cases you should be able to construct the
appropriate contour.

C Power series

You should know how to expand a holomorphic function in a Taylor or Laurent
series and be able to calculate the radius of convergence of such a series.


Exercises 

20.1. (a) Suppose that f(z) is a holomorphic function. Let $z = x + iy$, $f(z) = u + iv$.
Show that the families of curves u(x, y) = constant and
v(x, y) = constant are orthogonal; i.e. they cross at right angles.

(b) Verify this property explicitly for the case where $f(z) = z^2$. Sketch a
few curves. Do the same for $f(z) = 1/z$. What happens at the origin
in this case?

20.2. (a) Suppose that f(z) is a holomorphic function whose real part is
$3x^2y - y^3$. Determine the imaginary part of f(z).

(b) Can there be a holomorphic function whose real part is $xy^2$? If so,
find one. If not, explain why not.

20.3. We can define holomorphic functions cos z and sin z by using the identities
$\cos z = \tfrac12(e^{iz} + e^{-iz})$, $\sin z = (e^{iz} - e^{-iz})/2i$ and the definition of the
complex exponential function.

(a) Express cos z and sin z in terms of trigonometric and hyperbolic
functions of x and y. Calculate d(cos z) and d(sin z) by using these
expressions.

(b) Prove that the addition formulas sin (z + w) = sin z cos w + cos z sin w,
cos (z + w) = cos z cos w - sin z sin w hold even for complex z and w.

(c) Using the addition formulas, show explicitly that the functions sin z
and cos z are continuously differentiable from the complex point of
view, and calculate the derivatives of these functions by evaluating the
appropriate limits.

20.4. Use the technique presented on pp. 736-8 to establish the following
definite integrals:

fa) i 

sin 6 Bdf) = —; 

W J 

o 16 

(b) 

"' t cos0d0 na 

a 2 < 1. 

l 1_O n A_1__1__ 
20.5. Use technique (a) on p. 736 to establish the following definite integrals:

(a) $\int_{-\infty}^{\infty}\frac{dx}{(x + b)^2 + a^2} = \frac{\pi}{a}$, $a > 0$;

'°° dx n 

(b) 

( v 4 i /i t 4\ 0-3 ’ ^^0? 



(c) $\int_0^{\infty}\frac{x^2\,dx}{x^6 + 1} = \frac{\pi}{6}$.

20.6. (a) Suppose that f(z) is holomorphic in a region except at $z = a$, where
it has a pole of second order. Prove that the residue at the pole is given
by the formula

$$\operatorname{res} = \lim_{z \to a}\frac{d}{dz}\left((z - a)^2f(z)\right).$$

(b) Use the above result to evaluate

$$\int_{-\infty}^{\infty}\frac{dx}{(x^2 + a^2)^2}.$$

(c) Generalize the above technique to evaluate

$$\int_{-\infty}^{\infty}\frac{dx}{(x^2 + a^2)^k}$$

for an arbitrary positive integer k.

20.7. (a) Here is an alternative approach to technique (b) on p. 736. Let

$$R(z) = P(z)/Q(z)$$

where P and Q are polynomials with $\deg Q \ge \deg P + 1$ so that

$$|R(z)| \le A/|z|$$

for some constant A for sufficiently large |z|. Suppose that Q has no
real zeros. The purpose of this exercise is to prove that the

$$\lim_{X_1 \to \infty,\ X_2 \to \infty}\int_{-X_1}^{X_2} R(x)e^{ix}\,dx$$

exists and its value is

$$\int_{-\infty}^{\infty} R(x)e^{ix}\,dx = 2\pi i\sum\operatorname{res}\left(R(z)e^{iz}\right)$$

where the sum is taken over all the poles lying in the upper half-plane.
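Step 1. Choose $X_1$, $X_2$ and Y large enough so that the rectangle in figure
20.16 contains all the poles of R in the upper half-plane.

Figure 20.16 (rectangle with vertices $-X_1$, $X_2$, $X_2 + iY$, $-X_1 + iY$)

Step 2. Show that

$$\left|\int_{X_2}^{X_2 + iY} R(z)e^{iz}\,dz\right| \le A\int_0^Y\frac{e^{-y}}{|z|}\,dy \le \frac{A}{X_2}\int_0^Y e^{-y}\,dy \le \frac{A}{X_2}$$

for $X_2$ large enough.

Step 3. Show similarly that

$$\left|\int_{-X_1}^{-X_1 + iY} R(z)e^{iz}\,dz\right| \le \frac{A}{X_1}$$

and that

$$\left|\int_{-X_1 + iY}^{X_2 + iY} R(z)e^{iz}\,dz\right| \le Ae^{-Y}(X_1 + X_2)/Y.$$

Step 4. Let $Y \to \infty$. Conclude that

$$\left|\int_{-X_1}^{X_2} R(x)e^{ix}\,dx - 2\pi i\sum_{\operatorname{Im} z > 0}\operatorname{res}\left(R(z)e^{iz}\right)\right| \le A\left(\frac{1}{X_1} + \frac{1}{X_2}\right).$$

Complete the proof.

(b) Evaluate $\int_0^{\infty}(\cos x/(x^2 + a^2))\,dx$.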

20.8. Suppose that $Q(0) = 0$ while $Q'(0) \ne 0$ so that $R(z)e^{iz}$ has a simple pole at 0.


same as before except that we avoid the origin by following a small 
20.9. Let $f(z) = (f_{ij}(z))$ be a matrix-valued function. We say that f is holomorphic
in a region D if each of the matrix entries $f_{ij}$ is a holomorphic
function on D. If $\gamma$ is any curve, define $\int_\gamma f(z)\,dz$ to be the matrix whose
ijth entry is $\int_\gamma f_{ij}(z)\,dz$. Notice that if B and C are constant matrices, then
if f is a holomorphic matrix-valued function so is BfC, and for any
curve $\gamma$ we have

$$\int_\gamma Bf(z)C\,dz = B\left(\int_\gamma f(z)\,dz\right)C.$$

(a) Let

$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$$

and set $f(z) = (zI - A)^{-1}$, where

$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Show that f is holomorphic everywhere except at $z = 1$, $z = 2$, and
$z = 3$.

Let $P_{\gamma_i} = (1/2\pi i)\int_{\gamma_i} f(z)\,dz$. Evaluate $P_{\gamma_i}$ for i = 0, 1, 2 and 3.

(b) Let D be a 3 x 3 matrix whose eigenvalues are 1, 2 and 3. Let
$g(z) = (zI - D)^{-1}$. Show that g is holomorphic everywhere except at z = 1, 2
or 3. Let

$$Q_{\gamma_i} = \frac{1}{2\pi i}\int_{\gamma_i} g(z)\,dz.$$

Describe im $Q_{\gamma_1}$ and ker $Q_{\gamma_1}$. What is $Q_{\gamma_1}^2$? Describe im $Q_{\gamma_2}$ and ker $Q_{\gamma_2}$.
What is $Q_{\gamma_2}^2$?

(c) Formulate a general theorem. Let A be an n x n matrix with distinct
eigenvalues $\lambda_1, \ldots, \lambda_n$. Let $f(z) = (zI - A)^{-1}$. Where is f holomorphic?
Let $\gamma$ be a curve that does not pass through any of the eigenvalues and
is the boundary of a region which contains $\lambda_1, \ldots, \lambda_k$ but none of the
remaining eigenvalues. Set

$$P_\gamma = \frac{1}{2\pi i}\int_\gamma f(z)\,dz = \frac{1}{2\pi i}\int_\gamma (zI - A)^{-1}\,dz.$$

What is im $P_\gamma$? What is ker $P_\gamma$? What is $P_\gamma^2$?


20.10. (a) Let f(z) denote the branch of the function $z^{1/4}$, defined everywhere
except on the positive real axis, with the property that

$$\lim_{\varepsilon \to 0} f(x + i\varepsilon) = |x|^{1/4} \quad\text{(a real quantity).}$$

Evaluate $f(-x)$ and $\lim_{\varepsilon \to 0} f(x - i\varepsilon)$ for real x.


Figure 20.19



Be sure to discuss the contributions from the circles.



20.11. Use technique (c) on pp. 738-9 to establish the following integrals by
integrating around the contour shown on p. 738.

(a) $\int_0^{\infty}\frac{x^{a-1}}{x + 1}\,dx = \frac{\pi}{\sin a\pi}$, $0 < a < 1$,

(b) $\int_0^{\infty}\frac{x^{a-1}}{(x + 1)^2}\,dx = \frac{(1 - a)\pi}{\sin a\pi}$, $0 < a < 2$.



$$\log z = \log r + i\theta, \qquad 0 < \theta < 2\pi,$$

valid whenever f(z) tends to zero as $|z| \to \infty$ fast enough so that the
contribution from the large circle vanishes.

(b) Show that 

f 00 dx na + 2b log (b/a) 


log x , n log a 

-Z -- dx =-. 

^2 , _2 T „ 
20.13. (a) Determine the Laurent expansion of

5 


by contour integration around the unit circle. Where does this
expansion converge?

(b) Find the same expansion by doing a partial fraction decomposition
of f(z), then expanding each term in powers of z or 1/z as appropriate
so that the series converges when $|z| = 1$.

(c) Find the Fourier expansion of 

F(0) = (f + 2cos0) _1 . 

20.14. Find the Fourier expansion of the function

$$F(\theta) = \theta, \qquad -\pi < \theta < \pi.$$

Can this be obtained as the Laurent expansion of a holomorphic function?
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Chapter 21 discusses some of the more elementary aspects of 
asymptotics. 


Introduction 


Many important laws of physics can be understood in terms of the asymptotic 
behavior of integrals. In this chapter we give some elementary illustrations of the 

and present some important physical and mathematical applications.

One of the earliest, and still most important, asymptotic formulas is
Stirling’s formula:

$$n! \sim (2\pi)^{1/2}\, n^{n+1/2}\, e^{-n},$$

discovered by James Stirling in the first third of the eighteenth century. Here the sign
~ may be taken to mean that the ratio of the two sides (which are both tending to $\infty$
with n) tends to 1 as $n \to \infty$. We shall soon state and prove a more accurate version of
his formula together with an estimate of the error.

Although Stirling’s formula only makes an assertion about large n, it gives a 
remarkably accurate approximation (in terms of percentage error) even for small n. 
Thus for n = 1 we have n! = 1 while the right-hand side of Stirling’s formula gives
0.92. For n = 10 we have 10! = 3 628 800 and the right-hand side gives 3 598 600, an
error of less than one percent. This is a recurrent phenomenon. In this chapter we
will give estimates valid for a large value of a parameter, but the formula seems to 
work well for small values as well. 
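
The numerical claims above are easy to reproduce. The following is a minimal sketch in Python (the function names `stirling` and `relative_error` are ours, chosen for illustration); it evaluates the right-hand side of Stirling's formula and compares it with the exact factorial.

```python
import math

def stirling(n: int) -> float:
    """Right-hand side of Stirling's formula: (2*pi)^(1/2) * n^(n + 1/2) * e^(-n)."""
    return math.sqrt(2.0 * math.pi) * n ** (n + 0.5) * math.exp(-n)

def relative_error(n: int) -> float:
    """Relative error (n! - approximation) / n!."""
    exact = math.factorial(n)
    return (exact - stirling(n)) / exact

# Tabulate n, n!, the approximation and the relative error for a few small n.
for n in (1, 2, 5, 10, 20):
    print(n, math.factorial(n), stirling(n), relative_error(n))
```

The relative error shrinks roughly like 1/(12n), which is why the approximation is already good to within one percent at n = 10.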


21.1. Laplace’s method 


We illustrate the method by giving a refined version of Stirling’s formula. 

Integration by parts shows that the $\Gamma$ function, defined by

$$\Gamma(x) = \int_0^\infty e^{-t}\, t^{x-1}\,dt, \qquad x > 0,$$

satisfies

$$\Gamma(x+1) = x\Gamma(x), \qquad \Gamma(1) = 1.$$




In fact, we shall prove a more precise result. We shall show that 



etc., where the $C_i$ are constants. In other words, whenever we break off the series, we
can estimate the error in terms of the next higher power of $x^{-1}$. We begin by writing
$\Gamma(x) = \Gamma(x+1)/x$ so



has the above Taylor expansion at the origin. Notice that $p'(w) = 0$ only at $w = 0$ and that p
has an absolute minimum at the origin. The factor $e^{-x}x^x$ occurs on the


ur first observation 


e o 


:u 










integral comes from w near 0, the minimum of p. For example, if $\varepsilon$ is any fixed positive


nur 


e xp(vv) dw < e x '* w2 dw 


< e 


2 > 2 d w. 


We can estimate this last integral very crudely by writing


_ — vvc2/4. ^ 


for $-\infty < w < -\varepsilon$. So



is a convergent integral, and, in fact $C(x) \to 0$ as $x \to \infty$. So





attention on $\int_{-\varepsilon}^{\varepsilon} e^{-xp(w)}\,dw$ for any $\varepsilon > 0$. Now $p(0) = 0$, $p'(0) = 0$ and $p''(0) = 1$. We can




is $s^2/2$, i.e.,

$$p(w(s)) = s^2/2.$$

For convenience, we remind the reader how this is done. Since p(0) = 0, we have 



where we have set $r = wu$ to get from the first integral to the second. Similarly,

$$p'(w) = w\int_0^1 p''(wu)\,du$$


- 


where 



Notice that q is a differentiable function of w and $q(0) = 1$. Thus $q(w) > 0$ for w
sufficiently small. Thus we may make the differentiable change of variables

$$s = w\sqrt{q(w)}.$$


Let us choose our $\varepsilon > 0$ small enough that this change of variables is valid for
$-\varepsilon < w < \varepsilon$. (Thus, so that $q(w)$ is positive on this range.) We can then make this
change of variables in the integral $\int_{-\varepsilon}^{\varepsilon} e^{-xp(w)}\,dw$ to get



where $\varepsilon_2 = \varepsilon\sqrt{q(\varepsilon)}$ and



In estimating the integral, the only region that matters is the neighborhood of
s = 0. For example, suppose we choose some function $\rho$, differentiable to all orders
and such that $\rho = 1$ for $|s| < \eta$ and $\rho = 0$ for $s > 2\eta$ and $s < -2\eta$, where $-\varepsilon_1 < -2\eta$, $2\eta < \varepsilon_2$. Then set





$$\int_{-\varepsilon_1}^{\varepsilon_2} e^{-xs^2/2}\, w'(s)\,ds = \int_{-\varepsilon_1}^{\varepsilon_2} e^{-xs^2/2}\, b(s)\,ds + \text{error}$$

where the error goes to zero exponentially with x, e.g., faster than some constant
times $e^{-\eta^2 x/2}$. (This is because the error will be given by integrals of the form
$\int_{-\varepsilon_1}^{-\eta} e^{-xs^2/2}(b(s) - w'(s))\,ds$ and $\int_{\eta}^{\varepsilon_2} e^{-xs^2/2}(b(s) - w'(s))\,ds$ and the integrand goes to
zero at least as fast as some constant times $e^{-\eta^2 x/2}$.)


since $b = 0$ for $s < -\varepsilon_1$ and $s > \varepsilon_2$. We are thus reduced to estimating this last
integral.



We now apply Taylor’s formula with remainder to write

$$b(s) = b_0 + b_1 s + b_2 s^2 + \cdots + b_{N-1}s^{N-1} + b_N(s)s^N$$

where $b_N(s)$ is a smooth bounded function. Then



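$$\int_{-\infty}^{\infty} e^{-xs^2/2}\, b(s)\,ds = \sum_{k=0}^{N-1} b_k \int_{-\infty}^{\infty} e^{-xs^2/2}\, s^k\,ds + R_N$$

where

$$R_N = \int_{-\infty}^{\infty} e^{-xs^2/2}\, b_N(s)\, s^N\,ds, \qquad |R_N| \le D_N \int_{-\infty}^{\infty} e^{-xs^2/2}\, |s|^N\,ds,$$

where

$$D_N = \sup |b_N(s)|.$$

Now, by symmetry,

$$\int_{-\infty}^{\infty} e^{-xs^2/2}\, s^k\,ds = 0 \quad \text{if } k \text{ is odd},$$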


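while, for even k, make the substitution

$$x^{1/2}s = y$$

so

$$\int_{-\infty}^{\infty} e^{-xs^2/2}\, s^k\,ds = x^{-1/2}x^{-k/2}\int_{-\infty}^{\infty} e^{-y^2/2}\, y^k\,dy = C_k\, x^{-1/2}x^{-k/2}$$

where

$$C_k = \int_{-\infty}^{\infty} e^{-y^2/2}\, y^k\,dy$$

is independent of x (with $C_0 = C_2 = \sqrt{2\pi}$). Notice that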



Similarly, the remainder term can be estimated by some constant times $x^{-1/2}x^{-N/2}$.


Thus, taking N = 2n even,

$$\int_{-\infty}^{\infty} e^{-xs^2/2}\, b(s)\,ds = \left(\frac{2\pi}{x}\right)^{1/2}\left(a_0 + a_1x^{-1} + \cdots + a_{n-1}x^{-n+1} + \varepsilon_n(x)\right)$$

where $\varepsilon_n(x) = O(x^{-n})$. Here $a_0 = b_0$, $a_1 = b_2$, $a_2 = b_4 C_4/\sqrt{2\pi}$, etc. Notice also that b and
dw/ds have the same Taylor expansion at the origin. Substituting the Taylor
expansion for $p(w) = w - \log(1 + w)$ gives the refined Stirling approximation.

More generally, suppose that we are given a smooth positive function p with an
absolute minimum at a point $t_0$ interior to its domain of definition and such
that $p''(t_0) > 0$. Let a(t) be some other smooth function and consider the integral

$$\int a(t)\, e^{-kp(t)}\,dt.$$





We can write this as

$$e^{-kp(t_0)}\int a(t)\, e^{-k(p(t) - p(t_0))}\,dt$$

where now $p(t) - p(t_0)$ takes the minimum value 0 at $t = t_0$. We can thus apply the
preceding method to conclude the principle of Laplace (1820):




$$\int a(t)\, e^{-kp(t)}\,dt \sim \frac{e^{-kp(t_0)}}{\sqrt{p''(t_0)}}\left(\frac{2\pi}{k}\right)^{1/2}\left[a_0 + \frac{a_1}{k} + \frac{a_2}{k^2} + \cdots\right]$$

where $a_0 = a(t_0)$ and the higher-order coefficients can be expressed in terms of a, p
and their derivatives at $t_0$.
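
To see the principle of Laplace at work numerically, here is a minimal sketch in Python. The test integrand is our own choice, not one from the text: $a(t) = \cos t$ and $p(t) = t^2/2 + t^4/4$, so that $t_0 = 0$, $p(t_0) = 0$, $p''(t_0) = 1$ and $a_0 = 1$. The helper names and the crude trapezoid rule are likewise assumptions made for illustration.

```python
import math

def laplace_leading_term(k, a0, p0, p2):
    """Leading term a(t0) * exp(-k p(t0)) * sqrt(2 pi / (k p''(t0)))."""
    return a0 * math.exp(-k * p0) * math.sqrt(2.0 * math.pi / (k * p2))

def trapezoid(f, lo, hi, n):
    """Composite trapezoid rule for f on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

def ratio(k):
    """Numerical integral divided by the leading term of Laplace's principle."""
    integrand = lambda t: math.cos(t) * math.exp(-k * (t * t / 2.0 + t ** 4 / 4.0))
    value = trapezoid(integrand, -4.0, 4.0, 16000)
    return value / laplace_leading_term(k, a0=1.0, p0=0.0, p2=1.0)

for k in (5.0, 50.0, 500.0):
    print(k, ratio(k))
```

The printed ratio approaches 1 as k grows, and the deviation from 1 decreases roughly in proportion to 1/k, in line with the form $a_0 + a_1/k + \cdots$ of the expansion.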


21.2. The method of stationary phase

Let us now examine what happens in Laplace’s method when we replace $-kp$ by $ikp$
in the exponential. We will find that a similar formula holds, but that this time all the
critical points of p enter, not just the absolute minimum. In Laplace’s method, the
exponential factor in the integrand was decaying exponentially relative to its value near the
minimum of p. For this reason all the asymptotic information was located at the
minimum. When we consider an integral of the form

$$\int e^{ikp(y)}\, a(y)\,dy$$

the factor $e^{ikp(y)}$ is not going to zero as $k \to \infty$. But it is oscillating rapidly as a






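power of k. Indeed, consider the differential operator

$$D = \frac{1}{p'(y)}\frac{\partial}{\partial y},$$

defined on the range where $p'(y) \neq 0$. We have

$$De^{ikp(y)} = ik\,e^{ikp(y)}$$

so

$$e^{ikp(y)} = \frac{1}{ik}\left(De^{ikp(y)}\right).$$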


Then 




derivatives) at the end points of integration. The function c has all the properties of a 
so we can repeat the process to conclude that 



proves our claim. 




Let us now consider the general case. We will still suppose that a is smooth with
compact support. We will no longer assume that $p'(y)$ has no zeros. But we will
assume that p has a finite number of critical points, $y_1, \ldots, y_r$ (points at which
$p'(y) = 0$) and that $p''(y) \neq 0$ at each of the critical points. We then claim that

$$\int a(y)\, e^{ikp(y)}\,dy \sim \left(\frac{2\pi}{k}\right)^{1/2}\sum_{\text{critical points } y_j}\frac{e^{\pm\pi i/4}\, e^{ikp(y_j)}}{\sqrt{|p''(y_j)|}}\left(a_{0j} + \frac{a_{1j}}{k} + \cdots\right)$$

where $a_{0j} = a(y_j)$ and the higher coefficients can be expressed in terms of the
values of a, p and their derivatives at $y_j$. The + sign in $e^{\pm\pi i/4}$ is used at minima, where
$p''(y_j) > 0$ and the - sign at maxima where $p''(y_j) < 0$. In the above expression, p is a
real-valued, and a may be a complex-valued function of y. This is known as the
formula of stationary phase. We shall prove this formula as follows. About each
critical point, $y_j$, we can find coordinates s such that


$$p(y(s)) = p(y_j) \pm s^2/2$$

(the ± sign depending on the sign of $p''(y_j)$). In making this change of variables, we
will have

$$\frac{dy}{ds} = \frac{1}{\sqrt{|p''(y_j)|}} \quad \text{at } y_j.$$

This accounts for the factor $1/\sqrt{|p''(y_j)|}$ occurring in the stationary phase formula.
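
The leading term of the stationary phase formula can be checked numerically. The sketch below, in Python, uses a test case of our own choosing: amplitude $a(y) = e^{-y^2}$ (rapidly decreasing rather than compactly supported, which is an assumption made for convenience) and phase $p(y) = \cosh y - 1$, which has a single critical point, a minimum at $y = 0$ with $p(0) = 0$ and $p''(0) = 1$. The function names are hypothetical.

```python
import cmath
import math

def amplitude(y):
    """Test amplitude a(y) = exp(-y^2); a(0) = 1."""
    return math.exp(-y * y)

def phase(y):
    """Test phase p(y) = cosh(y) - 1; only critical point is the minimum y = 0."""
    return math.cosh(y) - 1.0

def oscillatory_integral(k, half_width=5.0, n=40000):
    """Trapezoid-rule value of the integral of a(y) exp(i k p(y)) over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for i in range(n + 1):
        y = -half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * amplitude(y) * cmath.exp(1j * k * phase(y))
    return total * h

def leading_term(k):
    """(2 pi / k)^(1/2) e^{+pi i / 4} a(0) e^{i k p(0)} / sqrt|p''(0)| for this test case."""
    return math.sqrt(2.0 * math.pi / k) * cmath.exp(1j * math.pi / 4.0)

def ratio(k):
    return oscillatory_integral(k) / leading_term(k)

for k in (40.0, 160.0):
    print(k, ratio(k))
```

The ratio is a complex number close to 1, with a deviation that shrinks as k increases, as the $a_{1j}/k$ correction suggests.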

About each point $y_j$ choose a function $\phi_j$ such that $\phi_j = 1$ near $y_j$ and $\phi_j = 0$
outside some small neighborhood where the changes of variables are defined. Then

$$b = a - (a\phi_1 + a\phi_2 + \cdots + a\phi_r)$$






Figure 21.5


has the property that it has compact support and $p'(y) \neq 0$ anywhere that $b(y) \neq 0$.
Thus the preceding result applies to b, so we have

$$\int e^{ikp}\, a\,dy = \sum_j \int e^{ikp}\,(a\phi_j)\,dy + O(k^{-N})$$

for any N. We may thus consider each summand separately. We may make the


* 1_1_* _i ^ ..t 



already touched upon this subject in the previous chapter. 



Everything in this section stems from the basic fact that 



defines an analytic function that may be evaluated by taking $\eta$ to be real and then
using analytic continuation. For real $\eta$, we simply complete the square and make a
change of variables:



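$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp(-x^2/2 - \eta x)\,dx = \exp(\eta^2/2)\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp[-(x+\eta)^2/2]\,dx = \exp(\eta^2/2).$$

This is true for all complex values of $\eta$. In particular, taking $\eta = i\xi$, we get

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp(-x^2/2 - i\xi x)\,dx = \exp(-\xi^2/2).$$

In other words, the Fourier transform of $\exp(-x^2/2)$ is $\exp(-\xi^2/2)$. Now set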

/ ^ .1./- 


f = JU, we get 


TzvsTCTTnnrcxrai bhii m *ra anraivi k< 


e Xx /2 e l ^dx = — — —(e Xx /2 )e “^dx 

, JjjAxdx 

» -A R 2 l2 n -i£R n -XS 2 l2 n -i£S I'S 


■ Xx 2 /2 __ 

dx\ Xx 


In this last integral we can write 


-/U 2 /2 __ \ _ ^ -Xx 2 /2 

~ Ax dx 


e Ax2/2 e '^dx = O 


We thus see that 


-Xx*l2 -itxA 


the positive square root. In particular, if we set $\lambda = -ir$, $r > 0$, then

_ n/2 _ I 1 | l/2 „ -7ri/4 _ 


and, similarly with $\lambda = ir$,

' fe- 


rx 2 /2^ dx = C nijA-y l/2^i?; 2 /2r 





We can now complete the proof of the formula of stationary phase. We wish to
find an asymptotic evaluation of an integral of the form

$$\int_{-\infty}^{\infty} e^{\pm iks^2/2}\,\psi(s)\,ds$$

where $\psi$ is a function of compact support:

$$\psi(s) = \psi(0) + s\chi(s).$$

Multiply this equation by some function $\rho$ which is identically one where $\psi \neq 0$ and
vanishes for large values of $|s|$. Then

$$\psi(s) = \psi_0(s) + s\rho(s)\chi(s)$$

where

$$\psi_0(s) = \rho(s)\psi(0) = \psi(0) \quad \text{near } s = 0.$$

Thus

$$\int \psi_0(s)\, e^{\pm iks^2/2}\,ds = \psi(0)\left(\frac{2\pi}{k}\right)^{1/2} e^{\pm\pi i/4} + \text{error}$$

where the error term vanishes to infinite order in k. But


s[p(s)x(s)]e lksZ/2 


1 


li o^sjlfe^ 


+ ik 


ds 


ik 


ds 


[p(s)x(s)]-e ±4ks /2 ds 


$|s|$. We can thus repeat the process to yield the
asymptotic series, completing the proof of the formula.

We can give an n-dimensional version of the Gaussian integral.

Take

$$x = (x_1, \ldots, x_n)^T$$

to be a vector variable and let

$$Q = \begin{pmatrix} \pm r_1 & & 0 \\ & \ddots & \\ 0 & & \pm r_n \end{pmatrix}$$

be an n x n matrix with all the $r_i > 0$ and sgn Q = number of +s minus number of
-s. Then $\prod r_i = |\mathrm{Det}(Q)|$, and, by multiplying the formulas for the one-dimensional
integrals, we get

$$\frac{1}{(2\pi)^{n/2}}\int \exp\left(\frac{i}{2}\,Qx\cdot x\right)\exp(-i\xi\cdot x)\,dx = |\mathrm{Det}(Q)|^{-1/2}\exp\left(\frac{\pi i}{4}\,\mathrm{sgn}\,Q\right)\exp\left(-\frac{i}{2}\,Q^{-1}\xi\cdot\xi\right)$$

where now $\xi$ is also a vector and $Q^{-1}$ is the inverse matrix. We have proved this for



diagonal Q. But given any nonsingular symmetric matrix Q we can find an
orthogonal matrix O such that $OQO^{-1}$ is diagonal. This proves that the above
formula is true for all such Q.

We can use the n-dimensional version of the Gaussian integral to obtain an n-dimensional
version of the stationary phase formula. Let p be a function with a finite
number of critical points. We assume that each of the critical points of p is
non-degenerate. Thus the Hessian, the matrix

$$H(y_j) = \left(\frac{\partial^2 p}{\partial y^k\,\partial y^l}(y_j)\right)$$

is non-degenerate at each critical point, $y_j$. Let $\sigma = \sigma_j$ denote the signature of the jth
critical point. Then

$$\int a(y)\, e^{ikp(y)}\,dy \sim \left(\frac{2\pi}{k}\right)^{n/2}\sum_j \frac{e^{\pi i\sigma_j/4}\, e^{ikp(y_j)}}{\sqrt{|\mathrm{Det}(H(y_j))|}}\left(a(y_j) + \frac{a_1(y_j)}{k} + \cdots\right)$$

where the functions $a_1(y_j)$, $a_2(y_j)$, etc., can
be expressed in terms of a, p and their derivatives of various orders at $y_j$.


21.4. Group velocity 

As an application of the method of stationary phase in one variable, let us consider
the following situation. Suppose we have a family of traveling waves

$$e^{-(i/h)(E(p)t - px)}$$

where h is a small number so that 1/h plays the role of our large parameter, k. The
wave number of the space variation is p/h at each fixed time, t. Since we allow E to
depend on p, each of these waves is traveling with a different velocity. Now suppose
we superimpose such a family of traveling waves so that we consider an integral of
the form

$$\int a(p)\, e^{-(i/h)(E(p)t - px)}\,dp.$$


Let us further assume that the function a(p) is concentrated about some fixed value
$p_0$. In other words, we assume that a vanishes except for values of p close to $p_0$. Now
the method of stationary phase says that the only non-negligible contributions to
the integral come from values of p for which the derivative of the exponential term
with respect to p vanishes, i.e., for which

$$E'(p)t - x = 0.$$

Since a(p) vanishes unless p is close to $p_0$, this equation is really a constraint on x and
t: it says that the integral is essentially zero except for those values of x and t such that

$$x = E'(p_0)t$$

holds approximately. In other words, the integral looks like a little blip, when
thought of as a function of x, and this blip moves in time with velocity $E'(p_0)$. (This
blip is called a wave packet and the velocity $E'(p)$ is called the group velocity.)
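
The motion of the blip can be observed directly in a small computation. The following is a minimal sketch in Python under assumptions of our own: the dispersion relation $E(p) = p^2/2$ (so that $E'(p) = p$), a Gaussian amplitude $a(p)$ centered at $p_0 = 1$, and $h = 0.01$. The integral over p is replaced by a plain Riemann sum, and all names are hypothetical.

```python
import cmath
import math

H = 0.01      # small parameter h (assumed value)
P0 = 1.0      # center of the amplitude a(p)
SIGMA = 0.05  # width of the amplitude a(p)

def energy(p):
    """Assumed dispersion relation E(p) = p^2 / 2, so E'(p) = p."""
    return 0.5 * p * p

def amplitude(p):
    """Gaussian amplitude concentrated about P0."""
    return math.exp(-((p - P0) ** 2) / (2.0 * SIGMA ** 2))

def packet(x, t, n=400):
    """Riemann sum for the integral of a(p) exp(-(i/h)(E(p) t - p x)) dp."""
    lo = P0 - 5.0 * SIGMA
    dp = 10.0 * SIGMA / n
    total = 0j
    for j in range(n + 1):
        p = lo + j * dp
        total += amplitude(p) * cmath.exp(-1j * (energy(p) * t - p * x) / H)
    return total * dp

def peak_position(t, x_max=4.0, steps=400):
    """Grid point x in [0, x_max] where |packet(x, t)| is largest."""
    xs = [x_max * i / steps for i in range(steps + 1)]
    return max(xs, key=lambda x: abs(packet(x, t)))

for t in (1.0, 2.0):
    print(t, peak_position(t), "expected about", P0 * t)
```

With these choices the peak of the modulus sits near $x = E'(p_0)t = p_0 t$, even though the individual waves in the superposition travel with the different phase velocities $E(p)/p$.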





Let us examine what kind of function E can be of p if we demand that the
expression $Et - px$ be invariant under Lorentz transformations. Under a Lorentz
transformation

$$c^2t^2 - x^2 = c^2t'^2 - x'^2.$$

Thus (E, p) can be transformed into any other (E', p') with

$$E^2 - (pc)^2 = E'^2 - (p'c)^2,$$

so the invariant relation between E and p is of the form

$$E^2 - (pc)^2 = \text{constant}.$$


Let us call this constant $m^2c^4$ so that

$$E(p) = ((pc)^2 + m^2c^4)^{1/2}.$$

Then

$$E'(p) = \frac{pc^2}{E(p)} = \frac{p}{M}$$

where

$$E(p) = Mc^2 \quad \text{or} \quad M = (m^2 + (p/c)^2)^{1/2}$$

(so that, if p/c is small in comparison to m, then $M \approx m$). If we think of M as a mass,
then the relationship between the group velocity $E'(p)$ and p is precisely the
relationship between velocity and momentum. In this way we have associated a


If we think of E 


$$\lambda = \frac{1}{k} = \frac{h}{p} \quad \text{(de Broglie’s formula).}$$


frequency $\nu$,


$$E = h\nu \quad \text{(Einstein’s formula).}$$

In these formulas we have been thinking of h as a small parameter which we have 


constant). In Einstein’s formula it occurs as a conversion factor from inverse time to 
energy, and hence has units energy x time. It is given by 

$h = 6.626 \times 10^{-34}$ J s.


21.5. The Fourier inversion formula


Let us give an important application of the stationary phase formula. Let us take
n = 2 with coordinates x and $\xi$. Consider the function

$$p(x, \xi) = p_\eta(x, \xi) = x(\xi - \eta)$$

where $\eta$ is a fixed number. This function has only one critical point, at

$$x = 0, \quad \xi = \eta.$$

This critical point is non-degenerate with signature $\sigma = 0$.


-1— 

I ft i , 

*)dvd< 

l =— [ a(0,rj) H—--)- ••• 


2n 

_& 



5 /cV k J 





proof shows that it works as well for functions which vanish at infinity, together
with their derivatives at any order faster than any inverse power of k. In particular,
we shall take

$$a(x, \xi) = f(x)g(\xi)$$

where f and g are smooth functions of one variable, which vanish rapidly at
infinity with all their derivatives. So we have


$$\frac{k}{2\pi}\iint f(x)\,g(\xi)\, e^{ikx(\xi - \eta)}\,dx\,d\xi = f(0)g(\eta) + O\left(\frac{1}{k}\right).$$


Let us make the change of variable u = kx in the integral. So

1 .... / 1 \ 1 f ,fu\ . .. . . .. 


pq 


d 

nrv 

LhJ 

_27tk 


ul 


ii r / \ /i r \ 

■■ — 7 —- f I — )e , “' , | - g(£)e~ lu ^d£ )dw. 

kJ(2n)J J \kj \V( 2 ^)J 01 / 


Define $\hat{g}$, the Fourier transform of g, by the formula

$$\hat{g}(u) = \frac{1}{\sqrt{2\pi}}\int g(\xi)\, e^{-iu\xi}\,d\xi.$$

Thus, substituting into the preceding formula, we have


i i 

fl 

w \ 


(T-\ 


k J(2n) t 

J I 

Ui 

|y W c uw i^J wm'l) ^ w | 

is3 



suppose we choose f wit 


4)—<4 






Of course the same proof works in n dimensions to prove that

$$g(\eta) = \frac{1}{(2\pi)^{n/2}}\int \hat{g}(u)\, e^{iu\cdot\eta}\,du.$$
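
The one-dimensional case of the inversion formula can be tested numerically. Here is a minimal sketch in Python with an arbitrary rapidly decreasing test function of our own choosing, $g(\xi) = (1 + \xi)e^{-\xi^2/2}$; the integrals are truncated to $[-10, 10]$ and approximated by the midpoint rule, and the names are hypothetical.

```python
import cmath
import math

def g(xi):
    """Test function, smooth and rapidly decreasing (our choice)."""
    return (1.0 + xi) * math.exp(-xi * xi / 2.0)

def midpoint(f, lo, hi, n):
    """Composite midpoint rule for a (possibly complex-valued) function f."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def g_hat(u):
    """Fourier transform: (2 pi)^(-1/2) * integral of g(xi) exp(-i u xi) d xi."""
    integrand = lambda xi: g(xi) * cmath.exp(-1j * u * xi)
    return midpoint(integrand, -10.0, 10.0, 400) / math.sqrt(2.0 * math.pi)

def g_recovered(eta):
    """Inversion: (2 pi)^(-1/2) * integral of g_hat(u) exp(i u eta) du."""
    integrand = lambda u: g_hat(u) * cmath.exp(1j * u * eta)
    return midpoint(integrand, -10.0, 10.0, 400) / math.sqrt(2.0 * math.pi)

eta = 0.7
print(g(eta), g_recovered(eta))
```

The recovered value agrees with $g(\eta)$ up to quadrature error, and its imaginary part is negligible.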


21.6. Asymptotic evaluation of Helmholtz’s formula 


Let u be a solution of the reduced wave equation $(\Delta + k^2)u = 0$ in three-space. Recall
from section 19.6 that, if u satisfies the reduced wave equation, then outside some
closed surface S, we have Helmholtz’s formula

$$u(P) = \frac{1}{4\pi}\int_S \left(\frac{e^{ikr}}{r}\star du - u\star d\left(\frac{e^{ikr}}{r}\right)\right) \qquad (19.2)$$

for P outside S, while the integral vanishes for P inside S. As we indicated there,
this was Helmholtz’s explanation of Huyghens’ principle - and the absence of backward
waves. However, this vanishing inside S depends on contributions from the
whole surface. Fresnel believed that if all the waves were inside S, the integral from


Figure 21.6
each small surface element would produce a null effect at interior points due to
interference. We shall now show, using stationary phase, that Fresnel was right, up to
terms of order 1/k. In applying stationary phase, the function u, which occurs on the
right-hand side of the formula will itself be oscillatory, and we must make some
assumptions about its form before we can proceed. We shall assume that, near
S, $u = ae^{ik\phi}$ where a and $\phi$ are smooth, and that $\|\mathrm{grad}\,\phi\| = 1$. This would be the
case, for example, if u represented radiation from a single point, Q, lying inside S,
where $\phi(y) = \|y - Q\|$.

We shall assume that we are sufficiently far from S so that $1/r^2$ is negligible in
comparison with k, and that a and da are also negligible in comparison with k.
Substituting into (19.2) we find that the top-order term (relative to powers of k)
is

$$\frac{ik}{4\pi}\int_S (a/r)\, e^{ik(\phi + r)}(\star d\phi - \star dr).$$


Now the points of stationary phase are those points y, on S where
$\mathrm{grad}\,\phi(y) + \mathrm{grad}\,r(y)$ is normal to S. There are two possible situations.

Let us suppose for the moment that y is a non-degenerate critical point of type (b).
The top-order term in the stationary phase formula will vanish, and the total
contribution coming from y in Helmholtz’s formula will be of order 1/k. (Notice that,
if S were convex and $\mathrm{grad}\,\phi$ pointed outward, then, for any P inside S, all the critical
points would be of type (b). This, in a sense, justifies Fresnel’s view that there is local
cancellation of the backward wave.) For non-degenerate critical points of type (a)
since $\star d\phi(y) = -\star dr(y)$, we may, in computing the highest-order contribution to
the stationary phase formula, replace the above integral by


$$-\frac{ik}{2\pi}\int_S (a/r)\, e^{ik(\phi + r)}\star dr.$$


This shows that (up to order 1/k) the induced secondary radiation along S behaves
as if it

(i) has an amplitude equal to $1/\lambda$ times the amplitude of the primary wave
where $\lambda = 2\pi/k$ is the wavelength, and

(ii) has phase one-quarter of a period ahead of the primary wave. (This is a way
of interpreting the factor i.)


Fresnel made these two assumptions directly in his formulation of Huyghens’
principle, and this led many to regard his theory as being ad hoc. As we have seen,
they are a consequence of the method of stationary phase and Helmholtz’s
formula.

We still must discuss the question of when the critical points are non-degenerate.
We shall treat points of type (a); the points of type (b) can be treated in an identical
manner. Actually, the discussion is almost the same as in our treatment of emitted
radiation. Let us define the exponential map $E: S \times \mathbb{R}^+ \to \mathbb{R}^3$ by

$$E(y, r) = y + r\,\mathrm{grad}\,\phi(y).$$

Then the critical points on S associated with a point P consist precisely of those
y such that $E(y, r) = P$, where $r = \|y - P\|$. If $\mathrm{grad}\,\phi(y)$ is not tangent to S, then
E is a diffeomorphism near (y, 0).

It is not hard to show that y is a degenerate critical point for P = E(y, r) if and
only if (y, r) is a point at which the map E is singular. In this case we call P a
focal point of the map E at y. If P is not a focal point, then the index of the
Hessian of $\phi + r$ at y is the number of focal points on the ray segment from y to
P (counted with multiplicity). We leave the details to the reader.


Summary 

A Asymptotic expansion of integrals 

Given a function defined by an integral of the form

$$\int e^{-xp(\omega)}\,d\omega,$$

you should be able to identify what range of values of $\omega$ makes the major contribution
to the integral when x is large and to develop the first two terms of the asymptotic
expansion of the integral.


B Stationary phase 

Given an integral of the form 


$$\int a(y)\, e^{ikp(y)}\,dy,$$


you should be able to develop and apply a formula for the first two terms in the 
asymptotic expansion for large k. 

You should be able to develop the Fourier inversion formula by the method of 
stationary phase. 




21.1. The modified Bessel function of the third kind may be defined by the integral

$$k_0(x) = \frac{1}{2}\int_{-\infty}^{\infty} e^{-x\cosh t}\,dt.$$

Determine the first two terms in the asymptotic expansion of $k_0(x)$.
21.2. The Bessel function $J_0(x)$ may be defined by the integral

$$J_0(x) = \frac{1}{2\pi}\int_0^{2\pi} e^{ix\sin t}\,dt.$$

Use stationary phase to find the first two terms in the asymptotic
expansion of $J_0(x)$.




Exercises



21.3. The beta function B(x + 1, x + 1) may be represented by the integral

$$B(x+1, x+1) = \int_0^1 [t(1-t)]^x\,dt = \int_0^1 e^{x[\log t + \log(1-t)]}\,dt.$$

(a) For very large x, where does the principal contribution to the integral 
come from? 

(b) Find the leading term in the asymptotic expansion of B(x + 1, x + 1). 

(c) Describe carefully, and as explicitly as you can, how you would 
calculate more terms in the asymptotic expansion. Calculate one more 
term. 

21.4. (a) Describe the general strategy for obtaining an asymptotic expansion 
of an integral of the form 

$$I = \int_{\theta_1}^{\theta_2} e^{-ixp(\theta)}\, q(\theta)\,d\theta.$$

Assume that $p(\theta)$ has only one critical point on $[\theta_1, \theta_2]$, at $\theta = a$.

(b) Apply this strategy to obtain the leading term in the asymptotic 
expansion of the Bessel function 

1 / 

J„(x) = - 
n ' 

( r* \ 
Chapter 22 shows how the exterior calculus can be used in classical
thermodynamics, following the ideas of Born and Caratheodory.


The subject matter of this chapter is equilibrium thermodynamics. This subject has
several branches. The first of these - ‘classical thermodynamics’ - deals with the
notions of heat and work and gives rise to the concepts of entropy and temperature
and the celebrated ‘second law’. It deals with general principles but does not attempt
to describe the behavior of substances or compute measurable quantities in terms of
a microscopic model. A second branch - ‘equilibrium statistical mechanics’ - whether
in its classical or quantum versions, provides very explicit formulas for all
‘macroscopically observable quantities’ of equilibrium states in terms of a model of
the atomic or molecular interactions. These formulas are of a general nature. That is,
they all have a common underlying form, and are associated with the names of
Maxwell, Boltzmann and especially Gibbs. A third branch deals with explaining
why these formulas work, and studies this problem from the point of view of
probability theory or dynamics or both, and also deals with the question of
approach to equilibrium. We shall have very little to say about this third branch. The
first few sections will deal with classical thermodynamics. However, we will begin
with a purely mathematical theorem due to Caratheodory which is geometrically
quite plausible - although we will postpone some of the details of the proof to the
appendix to this chapter. This theorem serves as the mathematical tool which
implies the existence of the entropy function according to the approach to
thermodynamics given by Born and Caratheodory. We then derive some physical
consequences of the theory and devote the remainder of the chapter to ‘equilibrium
statistical mechanics’.


Let $\alpha = A_1\,dx^1 + \cdots + A_n\,dx^n$ be a linear differential form in $\mathbb{R}^n$. Here the $A_i$ are all
functions on $\mathbb{R}^n$. If we look at some fixed point P then the ‘value of $\alpha$ at P’ can be
thought of as the row vector $(A_1(P), \ldots, A_n(P))$. If this row vector is not zero, its null
space consists of the vectors $\xi = (\xi^1, \ldots, \xi^n)^T$ satisfying

$$A_1(P)\xi^1 + \cdots + A_n(P)\xi^n = 0.$$

Let $\gamma = x(t)$ be a piecewise differentiable curve. Then

$$\int_\gamma \alpha = \int \left(A_1(x(t))\frac{dx^1}{dt} + \cdots + A_n(x(t))\frac{dx^n}{dt}\right)dt.$$

In particular, if for every t the tangent vector $x'(t) = (dx^1/dt, \ldots, dx^n/dt)^T$ lies in the null space of
$(A_1(x(t)), \ldots, A_n(x(t)))$ then the integral $\int_\gamma \alpha = 0$. A curve $\gamma$ is called a null curve of $\alpha$ if $\gamma$
is continuous and piecewise differentiable and if, at every t for which $x'(t)$ is defined,
$x'(t)$ lies in the null space of $(A_1(x(t)), \ldots, A_n(x(t)))$. We want to consider the following
geometrical problem.

Suppose we start at some point P. What are the points Q which we can join to P by
null curves? For example, suppose that

$$\alpha = df$$

for some function f. Then if $\gamma$ is a null curve joining P to Q, we have

$$0 = \int_\gamma \alpha = \int_\gamma df = f(Q) - f(P).$$

Thus we must have f(Q) = f(P). If $df \neq 0$, the set f = f(P) is an (n - 1)-dimensional
surface passing through P, and the condition on Q is that it must lie on
this surface.

In particular, there will be points arbitrarily close to P which can not be joined to
P by a null curve. The same is true if $\alpha = g\,df$ where g is some non-zero function,
because $\gamma$ is a null curve for $\alpha$ if and only if it is a null curve for $g^{-1}\alpha = df$. The
conditions to be a null curve are the same. So again Q must lie on the (n - 1)-dimensional
surface f = f(P) if we are to be able to join Q to P by a null curve of $\alpha$.
Notice that if $\alpha = g\,df$ then $d\alpha = dg \wedge df$ so

$$\alpha \wedge d\alpha = g\,df \wedge dg \wedge df = 0.$$

Now consider the form

$$\alpha = dz + x\,dy$$

defined on $\mathbb{R}^3$. Here we have $d\alpha = dx \wedge dy$ so

$$\alpha \wedge d\alpha = dx \wedge dy \wedge dz$$

is nowhere zero. We wish to show that one can get from any point to any other point
by a null curve of $dz + x\,dy$. It is enough to show that we can get from the origin, 0, to
any other point Q. Let us write

$$Q = \begin{pmatrix} a \\ b \\ c \end{pmatrix}.$$

Let us first assume that $b \neq 0$. Now the x-axis is a null curve for $\alpha$ since both dz and
dy vanish along the line z = y = 0. More generally any line parallel to the x-axis is a
null curve for $\alpha$ for the same reason. So let us first move along the x-axis from 0 to the
point whose x coordinate is $-c/b$ during time $0 \le t \le 1$. Then go from

$$\begin{pmatrix} -c/b \\ 0 \\ 0 \end{pmatrix} \text{ to } \begin{pmatrix} -c/b \\ b \\ c \end{pmatrix} \text{ along } x(t) = \begin{pmatrix} -c/b \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ (t-1)b \\ (t-1)c \end{pmatrix}, \quad 1 \le t \le 2.$$

Since

$$x'(t) = \begin{pmatrix} 0 \\ b \\ c \end{pmatrix} \text{ and } (A_1(x(t)), A_2(x(t)), A_3(x(t))) = \left(0, -\frac{c}{b}, 1\right)$$

we see that this is also a null curve. Finally, go from

$$\begin{pmatrix} -c/b \\ b \\ c \end{pmatrix} \text{ to } \begin{pmatrix} a \\ b \\ c \end{pmatrix} \text{ along } x(t) = \begin{pmatrix} -c/b \\ b \\ c \end{pmatrix} + \begin{pmatrix} (t-2)(a + c/b) \\ 0 \\ 0 \end{pmatrix}, \quad 2 \le t \le 3.$$

In this way we get
from 0 to Q along a continuous, broken path having three pieces, each a straight line
segment which is a differentiable null curve.
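
The three-piece construction can be verified mechanically. The following is a minimal sketch in Python (the function names are ours): it builds the three straight segments for a given Q = (a, b, c) with b non-zero, evaluates $dz + x\,dy$ on the velocity vector at sample points of each segment, and confirms that the path ends at Q.

```python
def alpha(point, velocity):
    """Evaluate the one-form dz + x dy on a tangent vector at a point of R^3."""
    x, _, _ = point
    _, dy, dz = velocity
    return dz + x * dy

def broken_path(a, b, c):
    """Three straight segments, each as (start point, constant velocity), from 0 to (a, b, c).

    Requires b != 0, as in the first case treated in the text.
    """
    x0 = -c / b
    return [
        ((0.0, 0.0, 0.0), (x0, 0.0, 0.0)),   # along the x-axis to x = -c/b
        ((x0, 0.0, 0.0), (0.0, b, c)),       # raise y and z at fixed x = -c/b
        ((x0, b, c), (a - x0, 0.0, 0.0)),    # parallel to the x-axis to x = a
    ]

def check(a, b, c, samples=11):
    """Return (largest |alpha(velocity)| found along the path, end point of the path)."""
    worst = 0.0
    end = (0.0, 0.0, 0.0)
    for start, vel in broken_path(a, b, c):
        for i in range(samples):
            s = i / (samples - 1)
            pt = tuple(p + s * v for p, v in zip(start, vel))
            worst = max(worst, abs(alpha(pt, vel)))
        end = tuple(p + v for p, v in zip(start, vel))
    return worst, end

print(check(2.0, 3.0, -1.5))
```

On the middle segment the value is $c + (-c/b)\,b = 0$, which is exactly why the detour to $x = -c/b$ is made first.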
If b = 0 and $c \neq 0$ we make a slight detour. First go to the point

$$\begin{pmatrix} 1 \\ c \\ 0 \end{pmatrix}$$

along a null curve. Since $c \neq 0$ we can get to this point by the above procedure.
Then from

$$\begin{pmatrix} 1 \\ c \\ 0 \end{pmatrix} \text{ to } \begin{pmatrix} 1 \\ 0 \\ c \end{pmatrix} \text{ along } x(t) = \begin{pmatrix} 1 \\ c \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ -(t-3)c \\ (t-3)c \end{pmatrix}, \quad 3 \le t \le 4.$$

Since along this curve $dz = c\,dt$ and $dy = -c\,dt$ and x = 1, we see that $dz + x\,dy$
vanishes identically along this curve and so it is a null curve. We can then go to the



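We can thus consider three types of one-forms on $\mathbb{R}^3$:

Type                                                       Example

(i) $\alpha$ with $\alpha \neq 0$ but $d\alpha = 0$                        $\alpha = dz$

(ii) $\alpha$ with $d\alpha \neq 0$ but $\alpha \wedge d\alpha = 0$            $\alpha = x\,dy$

(iii) $\alpha$ with $\alpha \wedge d\alpha \neq 0$                         $\alpha = dz + x\,dy$.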

In case (i) we know that if $d\alpha = 0$, then, locally, we can find a function f such that
$\alpha = df$. (We can do so in any star-shaped region contained in the domain of definition
of $\alpha$.) If $\alpha$ does not vanish at some point P then the implicit function theorem implies
that we can make a change of variables so the function f becomes one of the
coordinates, say the coordinate z. In other words, up to a change of variables, in the
first row, the example represents the general case. The mathematical theorem that
we wish to quote says that the same holds in the remaining two cases. We will give a
statement and indicate the proof of this theorem later on in this section, but defer the



then we must be in case (i) or (ii). This assertion, or rather its n-dimensional 
generalization, is the content of Caratheodory’s theorem. 

Caratheodory’s theorem. Let $\alpha$ be a linear differential form with the property that for
any point P there are points Q arbitrarily close to P which can not be joined to P
by a null curve. Then (locally) there exist functions f and g such that

$$\alpha = f\,dg.$$

The functions f and g are not uniquely determined: if H is a function
of one variable with nowhere vanishing derivative, and

$$G = H \circ g,$$

then $dG = H'(g)\,dg$ by the chain rule, so we can also write $\alpha$ as
$\alpha = F\,dG$ where $G = H \circ g$ and $F = f/H'(g)$.

Before continuing our purely mathematical discussion of Caratheodory’s
theorem, let us sketch how it is actually used in thermodynamics. The detailed
discussion with the precise definitions and axioms will be given in the next section.
The quantity of heat given off by a chemical reaction performed in a specified way
such as at constant volume or at constant pressure is readily measurable. (It can for
example be measured in a clever device known as an ice calorimeter which makes use
of the fact that as ice melts it contracts, and it takes a specified amount of heat to melt
a certain amount of ice under standard conditions.) The amount of heat needed to
effect some small change (such as to raise the temperature of one gram of water by
one degree on a standard thermometer, or to expand its volume) when performed in
some specified way is (approximately) linearly related to these small changes. So the
‘quantity of heat added’ to a system in equilibrium is a linear differential form, $\alpha$. For
a long period of time it was thought that this form was exact. That is, it was thought
that there was a function, C (called the caloric), representing the ‘total amount of
heat in the system’ such that $\alpha = dC$. In other words, it was thought that the ‘caloric’
in the system was changed by the quantity of heat added. It was only gradually
realized that the form $\alpha$ is not closed and therefore cannot represent the infinitesimal
change in any function.

On the other hand it was realized quite early that the work done on a system is
given by a linear differential form $\omega$ which is not closed. Indeed, if we compress
a piston at high temperature, where the pressure is high, we do more work than
compressing it at a low temperature where the pressure is low. Therefore, we can
compress a piston at high temperature, then cool it off in its compressed state
(doing no work since the piston is stationary), expand it when cool and then heat
it up in its expanded condition. We will have gone around a ‘cycle’. That is, we
will have traversed a closed curve around which the integral of $\omega$ is not zero. (Of
course to check whether or not a form is closed, an effective way is to integrate
it around closed curves. This is why the ‘cycles’ play such an important role in
the development of the ideas of thermodynamics.)
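
The piston cycle just described can be made quantitative under an extra assumption that is ours, not the text's: that the working substance is one mole of ideal gas, so that P = RT/V and the work form is $\omega = -P\,dV$ in the coordinates (T, V). The following minimal sketch in Python integrates $\omega$ numerically around the four legs of the cycle; the names and numerical values are hypothetical.

```python
import math

R = 8.314  # gas constant in J / (mol K); one mole of ideal gas is assumed

def pressure(T, V):
    """Ideal-gas pressure for one mole (an assumption made for this sketch)."""
    return R * T / V

def work_on_gas(path, steps=20000):
    """Integrate omega = -P dV along path(s) = (T, V), 0 <= s <= 1, by the midpoint rule."""
    total = 0.0
    for i in range(steps):
        T0, V0 = path(i / steps)
        T1, V1 = path((i + 1) / steps)
        total += -pressure(0.5 * (T0 + T1), 0.5 * (V0 + V1)) * (V1 - V0)
    return total

def cycle_work(T_hot, T_cold, V_big, V_small):
    """Work done on the gas around: compress hot, cool, expand cold, reheat."""
    legs = [
        lambda s: (T_hot, V_big + s * (V_small - V_big)),     # compress at T_hot
        lambda s: (T_hot + s * (T_cold - T_hot), V_small),    # cool at fixed volume
        lambda s: (T_cold, V_small + s * (V_big - V_small)),  # expand at T_cold
        lambda s: (T_cold + s * (T_hot - T_cold), V_big),     # heat at fixed volume
    ]
    return sum(work_on_gas(leg) for leg in legs)

w = cycle_work(400.0, 300.0, 2.0e-3, 1.0e-3)
print(w, R * (400.0 - 300.0) * math.log(2.0))
```

The integral around the closed curve is not zero; for the ideal gas it equals $R(T_{hot} - T_{cold})\log(V_{big}/V_{small})$, which is one way to see that $\omega$ is not closed.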

The first law of thermodynamics asserts that although neither $\alpha$ nor $\omega$ are closed,
the sum is closed - that $d(\alpha + \omega) = 0$. In other words, the first law states that (locally)

$$\alpha + \omega = dU$$

where U is a well defined function on the system (determined up to an additive
constant) known as the internal energy. The existence of U is a version of the
physical principle of conservation of energy.

The second law of thermodynamics derives from the behavior of a system when no
heat exchange is allowed. If we put a system in an enclosure (known as an adiabatic
enclosure) where no heat is exchanged, this means that we are constraining the
system to change according to paths which are null curves of α. It is an observed fact
of nature that then near every state of the system there will be nearby inaccessible
states. By Caratheodory’s theorem, this implies that there are functions f and g such
that

α = f dg.

Of course this does not determine f or g completely, as we mentioned in our
discussion above. However an analysis of exchange between systems in thermal
contact shows that we can choose f to be a universal function, T, of all systems
known as the absolute temperature which is determined up to a multiplicative
constant (a scale factor). Having fixed the scale of the absolute temperature, this
means that the function g is determined up to an additive constant for each fixed
system. It is usually denoted by S and is called the entropy of the system. Thus the
second law of thermodynamics asserts that there exists a universal temperature scale
T and a function S on the equilibrium states of each system (determined up to an
additive constant once the temperature scale is fixed) such that

α = T dS.

The temperature function can be chosen to be always positive. With this convention,


jsiri] iTiKfiVi 



Suppose the point Q has its w coordinate ≠ 0, say

w(Q) = r ≠ 0.

We can first move from 0 along the line x = y = z = u = v = 0 to the point whose w
coordinate is r. We will then move entirely in the hyperplane w = r. Now on any
curve in the hyperplane w = r, the form α restricts to the same values as the form

α' = r dz + x dy + u dv.


I KHTSTTCl TTKWa STiTiTTCVn Vi 
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z 1 + xdy + udv. 


We can now adjust the remaining five variables using the three-dimensional case
as before. Similarly, if either the x coordinate or the u coordinate of Q does not


>XaTT^l*S ■ ■ r£m^l kii 11 ■ i ■■■t/ITU.'a111 ■ ft 111< J I M.'l I 




coordinates. 

On the other hand, if Q lies in the three-dimensional subspace given by the
equations x = u = w = 0, then any curve in this subspace is a null curve, so once
again we can join the origin to Q. Thus in ℝ^6 we can consider six types of one-forms:

Type                                             Example

(i)   α ≠ 0 but dα = 0                           dz
(ii)  dα ≠ 0 but α ∧ dα = 0                      x dy
(iii) α ∧ dα ≠ 0 but dα ∧ dα = 0                 dz + x dy
(iv)  dα ∧ dα ≠ 0 but α ∧ dα ∧ dα = 0            x dy + u dv
(v)   α ∧ dα ∧ dα ≠ 0 but dα ∧ dα ∧ dα = 0       dz + x dy + u dv
(vi)  dα ∧ dα ∧ dα ≠ 0                           w dz + x dy + u dv
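
The conditions in the table can be verified directly for the listed examples. The following worked computation is a sketch for types (iii) and (iv), using only the rule d(f dg) = df ∧ dg and the antisymmetry of the exterior product; the other rows are checked in the same way.

```latex
\begin{align*}
\text{Type (iii):}\quad
\alpha &= dz + x\,dy, &
d\alpha &= dx \wedge dy, \\
\alpha \wedge d\alpha &= dz \wedge dx \wedge dy \neq 0, &
d\alpha \wedge d\alpha &= dx \wedge dy \wedge dx \wedge dy = 0. \\[1ex]
\text{Type (iv):}\quad
\alpha &= x\,dy + u\,dv, &
d\alpha &= dx \wedge dy + du \wedge dv, \\
d\alpha \wedge d\alpha &= 2\,dx \wedge dy \wedge du \wedge dv \neq 0, &
\alpha \wedge d\alpha \wedge d\alpha
  &= 2\,(x\,dy + u\,dv) \wedge dx \wedge dy \wedge du \wedge dv = 0.
\end{align*}
```

In the last product each term repeats one of the differentials dy or dv, which is why it vanishes.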


Our arguments show that for examples (iii)-(vi) we can move from any point to


■m a ii i muipj ■ igigr* ^ ai) > r* mi j ■•tiivjKijiipi i«i ij imiEiufeK 


has the form given on the right. Granted this fact (whose proof we shall sketch in the
appendix) we can prove Caratheodory’s theorem (in six dimensions) as follows. We
begin by showing that we must have

dα ∧ dα ∧ dα = 0.

Indeed, suppose the contrary - that α is such that dα ∧ dα ∧ dα is not identically
zero. Then at some point P (and therefore in an entire neighborhood of P) we are in
case (vi). Hence we can make a change of coordinates near P so that α has the
standard form for case (vi). But then every point near P can be joined to P by a
null curve of α. However our hypothesis is that every point has nearby points which
are unreachable by null curves. Contradiction. So we must have dα ∧ dα ∧ dα = 0.
Applying the same argument to cases (v), (iv) and (iii) in turn, we work our way
down to prove that α ∧ dα = 0 and thus conclude Caratheodory’s theorem. A similar
argument works in n dimensions.

22.2. Classical thermodynamics according to Born and Caratheodory

In this section we present the basics of thermodynamics from the point of view of
Born and Caratheodory.




w I-I--* 

motion machines. Rather it is based on ideas abstracted from everyday experience 
and some simply stated physical laws. In this approach ‘heat’ is a derived concept.


The theory is formulated in terms of standard concepts of elementary mechanics. 
But although the concepts of mechanics enter into the theory, the laws of elementary 
mechanics must be modified. It is the modification of these laws that involves such 
notions as temperature and heat. Then two simply stated physical laws together with 


absolute temperature and entropy.


iiTjTirj [gllf nsrn 


universe. This portion of the universe is called a system. The system can exist in 




enormously complicated. For instance, if our system consisted of a gas in some
enclosure, the gas might be in turbulent motion in which case we would have to 

its state. Thus we would expect that the collection of all states might be infinite- 
dimensional in an appropriate sense. In any event, we shall deal with ‘system’ and 
‘state’ as undefined terms in our theory. Later on, when we develop a ‘model’ for 
thermodynamics, such as (classical) statistical mechanics, the undefined terms of 


mwi mm maw 11 ■■ a a i 


Another term which is intuitively clear but which will be left undefined is interaction. 
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le universe or with other systems. 

The next basic assumption is that among all the possible states of the system there 




system with the rest of the universe are held constant then the system will pass 
through various states but tend to a definite equilibrium state (determined by the 
initial state of the system at the moment the interactions are held constant). 
Although the general states of the system are very complicated, the class of
equilibrium states is relatively simple to describe and can be parameterized by a
finite number of variables. Thus the set of equilibrium states has the structure of




states is a subset of the set of all states of the system.

The next assumptions single out certain types of interactions: There exists a

[Figure 22.2. Labels: System 2; Combined system.]


form of interaction there is no observed macroscopic motion nor any exchange of
material. The changes which do occur can be described as follows. Consider the
combined system of the two original systems as a new system. This combined system
is the direct product of the two original systems in the sense that the states of the new
system are pairs (p_1, p_2) where p_1 is a state of the first system and p_2 is a state of the
second. In diathermal contact the equilibrium states of the combined system
constitute a subset of the set of pairs of equilibrium states of the original systems. In
other words, if p_1 is an equilibrium state of the first system and p_2 is an equilibrium
state of the second, and the two systems are brought into diathermal contact, the
combined system will not necessarily be in equilibrium but will tend to a definite
equilibrium state (q_1, q_2) where q_1 is a new equilibrium state of the first system and q_2
is a new equilibrium state of the second system. We say that the first system in state
q_1 is in thermal equilibrium with the second system in state q_2.

Figure 22.3

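It is an observed law of nature (sometimes called the zeroth law of thermodynamics)
that thermal equilibrium is transitive. That is, if we
have three systems and if p_1, p_2, and p_3 are equilibrium states of these systems such
that p_1 would be in thermal equilibrium with p_2 if the first two systems were to be
brought into diathermal contact and p_2 and p_3 would be in thermal equilibrium
if the last two systems were to be brought into diathermal contact, then p_1 and p_3
would be in thermal equilibrium if the first and third systems were to be brought
into diathermal contact.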








submanifolds by the implicit function theorem. These submanifolds θ = const are




on the isotherms. 

There exists another important class of interactions of the system with the 

* -111 . j * . .. 1 _ .1 
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that for such interactions the equilibrium is not disturbed unless there is some 
change in the configurational variables. If, for a period of time, all the interactions 
of the system are adiabatic, the system is said to be in an adiabatic enclosure. A 
familiar, everyday (approximation to an) adiabatic enclosure is a thermos bottle. 


urve y 101 




ureis 


called an adiabatic curve. The first law of thermodynamics is a generalization from 


one s 
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work is always the same no matter how this work is applied. There is therefore a 
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W= U(p') - U(p) represents the amount of work done on the system when the 
system is moved from p to p' along any adiabatic curve. 

If the system is not in an adiabatic enclosure and is brought from p to p' by some
curve of interactions, then the total external work applied need no longer equal
U(p') - U(p). The difference Q = U(p') - U(p) - W is called the heat supplied to the
system by the process. Here the work done and the heat supplied depend on the
process and not merely on the initial and final states p and p'.

(This is the formulation of the first law of thermodynamics in the Born- 
Caratheodory approach. Notice that in this approach heat is defined as the 
difference between the change in internal energy and the work applied to this 
system. This idea, to regard heat as a derived rather than a fundamental quantity, is 
both the strength and the weakness of the Born-Caratheodory approach. It is its 
strength because it reflects the fundamental view in physics for more than a 
century - that the basic object is energy which is conserved when all its forms are 
taken into account. But its weakness is that it does not reflect the experimental or 
historical realities. In fact, it is usually the heat added to a system which is easiest to 



sure was changed by adding a specific amount of heat, this same change could be 


always the same, no matter how the work was done. So in the actual experiments it is 
the heat added which is the measured quantity, along with various types of work. It 
is a deduction from the existence of a unique mechanical equivalent of heat that 
allows us to conclude that the total work in bringing the system from one state to 
another is independent of the path if the work is done adiabatically.) 

The next basic notion in the theory is that of a reversible curve. A curve γ is called
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The intuitive meaning of reversibility is that the interactions are proceeding very 
slowly relative to the ‘relaxation time’ of the system, i.e., the time it takes the system 
to reach its equilibrium state. For example, suppose the interaction takes place 
through some piston and rod arrangement which changes the volume of a gas. If the 



equilibrium states of a system such that the work done on the system along any
(almost) reversible curve γ is (approximately) equal to the integral along γ of the
linear differential form

ω = v_1 dx_1 + ··· + v_n dx_n.

Therefore the heat supplied along the path γ is given by the integral of the form

α = dU - ω.

In case x_1 = V is the volume, it is customary to call p = -v_1 the pressure so that

ω = -p dV + v_2 dx_2 + ··· + v_n dx_n

and

α = dU - ω = dU + p dV - v_2 dx_2 - ··· - v_n dx_n.





We now come to the celebrated ‘second law of thermodynamics’. It is an everyday 




precisely, given a system in an adiabatic enclosure, there are certain types of work
that can be done on the system, such as violent shaking or stirring, which are such
that you cannot get your work back by any reversible adiabatic curve. The assertion
is that this can happen near any equilibrium state. Thus the second law of
thermodynamics asserts that:

Near any equilibrium state p of any system there exist arbitrarily close
equilibrium states which cannot be joined to p by reversible adiabatic
curves.


This simple assertion has far-reaching consequences as we shall now see. The law
asserts that near any point of the manifold of equilibrium states there are arbitrarily
close points which cannot be reached by null curves of α. It follows from
Caratheodory’s theorem that for any system there exist functions λ and φ defined on
the manifold of equilibrium states such that α = λ dφ.

Of course, for any given system, neither the function λ nor the function φ is
determined completely by the equation α = λ dφ. However, once we have fixed one
such choice of λ, then φ will be determined up to an additive constant. In the next
section we shall show that by considering how α must behave when we combine
systems, we can conclude that there is a preferred choice of temperature function, T,
(determined up to a constant factor) so that we can take λ = T for all systems. This
function T is called the absolute temperature. In terms of some chosen empirical
temperature θ, this says
that we may choose λ = T(θ) for all systems.

Once we decide to use the absolute temperature for our temperature scale, then
for any system we may write

α = T dS

where the function S is then determined up to an additive constant. The function S is
called the entropy.
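
To see concretely what this factorization of the heat form says, here is a minimal numerical sketch. It assumes (this is only derived later in the chapter) one mole of a monatomic ideal gas, for which the heat form is (RT/V) dV + C_V dT with constant C_V = 3R/2; the function names and the rectangular cycle in the (V, T) plane are illustrative. Around a closed curve the heat form integrates to a nonzero total, while the heat form divided by T integrates to zero, as it must if the quotient is the exact form dS.

```python
import math

R = 8.314      # gas constant in J/(mol K)
C_V = 1.5 * R  # assumed: one mole of a monatomic ideal gas


def heat_form(v, t):
    """Components (coefficient of dV, coefficient of dT) of the heat form, ideal gas assumed."""
    return (R * t / v, C_V)


def heat_form_over_t(v, t):
    """Components of the heat form divided by the absolute temperature."""
    a_v, a_t = heat_form(v, t)
    return (a_v / t, a_t / t)


def loop_integral(form, corners, steps=2000):
    """Midpoint-rule integral of a one-form around the closed polygon through `corners`."""
    total = 0.0
    closed = corners + [corners[0]]
    for (v0, t0), (v1, t1) in zip(closed, closed[1:]):
        dv, dt = (v1 - v0) / steps, (t1 - t0) / steps
        for k in range(steps):
            s = (k + 0.5) / steps
            a_v, a_t = form(v0 + s * (v1 - v0), t0 + s * (t1 - t0))
            total += a_v * dv + a_t * dt
    return total


# Rectangle in the (V, T) plane: two isotherms joined by two constant-volume legs.
CYCLE = [(1.0, 300.0), (2.0, 300.0), (2.0, 400.0), (1.0, 400.0)]

heat_around_cycle = loop_integral(heat_form, CYCLE)
entropy_around_cycle = loop_integral(heat_form_over_t, CYCLE)
print(heat_around_cycle, entropy_around_cycle)  # first is far from zero, second is essentially zero
```

The nonzero first integral is the statement that the heat form is not closed; the vanishing second integral is what allows S to be defined as a function on the equilibrium states.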


22.3. Entropy and absolute temperature 

Suppose that two systems are in diathermal contact. We may assume that the energy
involved in bringing the two systems together is negligible. There are no moving
walls between the systems and no direct exchange of matter. Thus the total internal
energy U at any state of the combined system is the sum of the energies of the
component states

U(p_1, p_2) = U_1(p_1) + U_2(p_2)

for any state (p_1, p_2) of the combined system. Here U denotes the internal energy
function of the combined system and U_1 the internal energy function of the first
system while U_2 denotes the internal energy function of the second system.
Similarly, the work forms are additive:

ω = ω_1 + ω_2.

This equation is to be understood in the following sense. ω_1 is a linear differential
form defined on the set of equilibrium states of the first system. It can be considered
as a linear differential form defined on the product space consisting of all pairs of
equilibrium states. It simply does not involve the equilibrium states of the second
system. Similarly ω_2 can be defined on the set of such pairs of equilibrium states.
Hence ω_1 + ω_2 is defined on the space of all pairs of equilibrium states. The space of
equilibrium states of the combined system in thermal contact is a submanifold of the
space of all pairs (in fact the submanifold specified by the equation θ_1 = θ_2). Then the
above equation asserts that ω is the restriction of ω_1 + ω_2 to this submanifold.



[Figure 22.4. The submanifold given by the equation θ_1 = θ_2.]


It follows that (in the same sense)

α = α_1 + α_2.

Writing

α = λ dφ, α_1 = λ_1 dφ_1, and α_2 = λ_2 dφ_2

we obtain

λ dφ = λ_1 dφ_1 + λ_2 dφ_2.        (*)


For any system, we can change the temperature (for example by changing V at
constant pressure) adiabatically. This means that for system 1 the forms α_1 and dθ_1
are linearly independent, and similarly for system 2. As α_1 = λ_1 dφ_1 this implies that
the differential forms dθ_1 and dφ_1 are independent. By the implicit function theorem,
this means that we can make a change of variables so that θ_1 and φ_1 are the first two
coordinates, i.e. that the local coordinates on the manifold of equilibrium states of
system 1 are (θ_1, φ_1, y_1, ...) and similarly for system 2. So local coordinates on the set
of pairs of equilibrium states can be taken as

(θ_1, θ_2, φ_1, φ_2; y_1, ...; z_2, ...)

while the equilibrium states of the combined system (in thermal contact) are
described by the condition θ_1 = θ_2. So we can use

(θ, φ_1, φ_2; y_1, ...; z_2, ...)





function λ does not vanish anywhere. So we may divide (*) by λ to write it as

dφ = (λ_1/λ) dφ_1 + (λ_2/λ) dφ_2.

This means that φ is a function, F, of φ_1 and φ_2 alone, and that

λ_1/λ = f_1(φ_1, φ_2) where f_1 = ∂F/∂φ_1

is a function only of φ_1 and φ_2 and similarly

λ_2/λ = f_2(φ_1, φ_2) where f_2 = ∂F/∂φ_2.

Thus

log λ_1 - log λ_2 = log (f_1/f_2)

where the right hand side depends only on φ_1 and φ_2. In particular, if we compute
the partial derivative of both sides with respect to θ the right hand side vanishes and
we get

∂(log λ_1)/∂θ = ∂(log λ_2)/∂θ.

The left hand side of this equation depends only on the variables of system
1, that is, it is a function of (θ, φ_1, y_1, ...) and the right hand side is a function of
(θ, φ_2, z_2, ...). The only way that they can agree for all values of the variables is for
each side to depend only on θ and to be the same function of θ. In other words, there
is some universal function g such that

∂(log λ)/∂θ = g(θ)        (**)

for all systems. By examining some specific systems one verifies that this function g is
nowhere equal to zero.

Now suppose that we make a change of variables in the empirical temperature -
that is, replace θ by T = T(θ). Then the coordinates are now (T, φ, ...) and we have

∂(log λ)/∂T = [∂(log λ)/∂θ](dθ/dT) = [∂(log λ)/∂θ](dT/dθ)^(-1).

Suppose that we choose T so that

dT/dθ = T g(θ).        (***)

Then ∂(log λ)/∂T = 1/T. A function T of θ satisfying (***) is
called an absolute temperature. Let us now go back over our entire discussion in this
section but now use an absolute temperature, T, for the θ. Then for each system we
have

log λ_i = log T + log G_i

where G_i is independent of T. Thus

λ_1 = T G_1, λ_2 = T G_2 and λ = T G

where the functions G_1, G_2, and G are independent of T. Also we know that λ_1/λ_2 is a
function only of φ_1 and φ_2. But λ_1 is a function of the first system, hence independent
of φ_2 and similarly λ_2 is independent of φ_1. Hence G_i is a function of φ_i alone for each
system.

Let us drop the subscripts once again. What we have shown is that for any system
we have

α = T G(φ) dφ,

where G is a function (depending on the system) which does not vanish anywhere.
But now we can solve the equation

dS = G dφ

which determines S up to an arbitrary additive constant. In other words, S is any
indefinite integral of G. Then we get

α = T dS.

To summarize, we have proved that

There exists a universal absolute temperature scale T determined up to a
multiplicative constant. Fixing this constant and thus choosing T
determines a function S on the set of equilibrium states of every system. This
function S is determined up to an additive constant (one for each system)
and is called the entropy. The heat form α is given by

α = T dS.
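
The condition defining an absolute temperature, dT/dθ = T g(θ) for the universal function g of this section, is an ordinary differential equation whose solution is T = c exp(∫ g dθ). The sketch below integrates it numerically. The particular g used, 1/(θ + 273.15), is a hypothetical choice standing in for a Celsius-like empirical scale; it is an assumption made for illustration and is not taken from the text, as are the function and parameter names.

```python
import math


def g(theta):
    """Hypothetical universal function for a Celsius-like empirical scale (assumed)."""
    return 1.0 / (theta + 273.15)


def absolute_temperature(theta, theta_ref=0.0, t_ref=273.15, steps=10000):
    """Integrate dT/dtheta = T * g(theta) starting from T(theta_ref) = t_ref.

    The solution is T = t_ref * exp(integral of g), and the integral is
    approximated by the midpoint rule.
    """
    h = (theta - theta_ref) / steps
    integral = h * sum(g(theta_ref + (k + 0.5) * h) for k in range(steps))
    return t_ref * math.exp(integral)


# Changing t_ref rescales T by a constant factor but leaves ratios unchanged.
ratio_a = absolute_temperature(100.0) / absolute_temperature(50.0)
ratio_b = absolute_temperature(100.0, t_ref=1.0) / absolute_temperature(50.0, t_ref=1.0)
print(ratio_a, ratio_b)
```

The free parameter t_ref plays the role of the multiplicative constant in the summary above: it fixes the size of the degree, while ratios of absolute temperatures do not depend on it.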


As we have already indicated, the existence of the function S and the equation
α = T dS imply Kelvin’s formulation of the second law - around any closed curve one
must have ∫dS = 0 and T has a constant sign (which may be chosen to be positive). If
heat is added along one portion of the cycle, heat must be extracted along some
other portion and hence ‘no cycle can exist whose net effect is a total conversion
of heat into work’. Thus ‘perpetual motion machines of the second kind’ cannot
exist.

A simple type of cycle, C, is a curve built up out of four portions. Two of them, say
C_3 and C_4, are on surfaces where S is constant (and so no heat is exchanged). Along
a third, say C_1, the system is in thermal contact with a large system (called a
heat reservoir) at a high temperature T_1, and along the fourth, C_2, the system is in
thermal contact with a cold reservoir at a low temperature, T_2. Let Q_1 = ∫_{C_1} α be the
total heat absorbed by the system along C_1 and let Q_2 = -∫_{C_2} α be the heat emitted
along C_2. Then Q_1 = ∫_{C_1} T dS = T_1 ∫_{C_1} dS and Q_2 = -T_2 ∫_{C_2} dS. But

0 = ∫_C dS = ∫_{C_1} dS + ∫_{C_2} dS, so that Q_2/T_2 = Q_1/T_1.

Now we have

0 = ∫_C dU = ∫_C ω + ∫_C α = ∫_C ω + Q_1 - Q_2.

Thus if W denotes the work done by the system so that

W = -∫_C ω

then

W = Q_1(1 - T_2/T_1).


This is, of course, the famous formula found in all the textbooks for the work done
by a Carnot cycle operating between the temperatures T_2 and T_1. The actual
statement of Carnot is that this is the most efficient way of extracting work from heat
between these two temperatures. That is, that for any machine extracting the
amount of heat Q_1 at T_1 and giving up heat at T_2 the above formula represents the
maximum possible output of work. This stronger statement has to do with
irreversible processes, and relates to the changes in the function S in the course of
irreversible adiabatic processes. Let us formulate the general statement, and leave





If two equilibrium states, σ_1 and σ_2, can be joined by a reversible adiabatic
curve then we know that they lie on the same S = const surface, i.e. S(σ_1) = S(σ_2). But
suppose that we start with the equilibrium state σ_1 and apply an adiabatic process
which is not necessarily reversible, and end up at an equilibrium state σ_2. Thus the
curve joining σ_1 to σ_2 might not lie in the submanifold of equilibrium states,
although the end points do. For example, we might keep a fluid in a thermos bottle
and stir a propeller in the fluid. After the stirring ends, the fluid in the thermos bottle
comes to an equilibrium state, with a different value of S. Experience shows that then
we cannot get back to the original state unless we extract heat from the system. Thus
(if we have chosen the multiplicative constant so that T is positive) we have S(σ_1) <
S(σ_2) for such an irreversible adiabatic curve. We can thus state this stronger version
of the second law of thermodynamics as:

if the equilibrium states σ_1 and σ_2 can be joined by an adiabatic curve (that
is if a process can lead from σ_1 to σ_2 while the system is in an adiabatic
enclosure) then

S(σ_1) ≤ S(σ_2)

with S(σ_1) = S(σ_2) if and only if σ_1 and σ_2 can be joined by a reversible
adiabatic curve.
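
The Carnot formula W = Q_1(1 - T_2/T_1) obtained earlier in this section holds for any working substance. As a consistency check, the sketch below evaluates the heats and the work for one particular substance, an ideal gas with constant ratio of specific heats; this choice, the function name and the sample numbers are assumptions made for illustration. For such a gas the heat absorbed along an isotherm at temperature T from volume V_a to V_b is nRT log(V_b/V_a), and T V^(γ-1) is constant along a reversible adiabatic.

```python
import math

R = 8.314  # gas constant in J/(mol K)


def carnot_cycle(t_hot, t_cold, v_a, v_b, gamma=5.0 / 3.0, n=1.0):
    """Heats and work for an ideal-gas Carnot cycle (assumed working substance).

    a -> b: isothermal expansion at t_hot;  b -> c: adiabatic expansion to t_cold;
    c -> d: isothermal compression at t_cold;  d -> a: adiabatic compression.
    """
    # Along an ideal-gas adiabatic, T * V**(gamma - 1) is constant.
    ratio = (t_hot / t_cold) ** (1.0 / (gamma - 1.0))
    v_c, v_d = v_b * ratio, v_a * ratio
    q_absorbed = n * R * t_hot * math.log(v_b / v_a)  # Q_1, taken in at t_hot
    q_emitted = n * R * t_cold * math.log(v_c / v_d)  # Q_2, given up at t_cold
    work_by_system = q_absorbed - q_emitted           # first law around the closed cycle
    return q_absorbed, q_emitted, work_by_system


q1, q2, w = carnot_cycle(t_hot=500.0, t_cold=300.0, v_a=1.0, v_b=2.0)
print(w, q1 * (1.0 - 300.0 / 500.0))  # the two values agree
```

The equality Q_1/T_1 = Q_2/T_2 that drives the formula shows up here as the fact that both isotherms span the same volume ratio.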


22.4. Systems with one configurational variable 

In this section we shall study the important special case where there is one
configurational variable, and hence the manifold of equilibrium states is
two-dimensional. Throughout this section we shall assume that the configurational
variable in question is volume. The corresponding function entering into the
expression for the work form is pressure and the work done on the system is given by
the differential form

ω = -p dV.

However, the considerations equally apply to many other interesting physical
systems. In many of the general theorems, all that has to be done is to replace volume
by the appropriate configurational variable and the pressure by the appropriate
‘generalized force’ in all the formulas of this section. Table 22.1 lists some of the
important physical systems. We will make no further mention of them in this
section.

Table 22.1 Systems with one configurational variable

System             Configurational variable        Generalized force                Work form ω in
                                                                                    joules = newton meters

Gas                volume V in m^3                 pressure p in N/m^2              ω = -p dV
Wire               length L in m                   tension τ in N                   ω = τ dL
Surface area       area A in m^2                   surface tension σ in N/m         ω = σ dA
Electrical cell    charge Z in coulombs            voltage E in volts               ω = E dZ
Magnetic material  total magnetic moment           magnetic field H in              ω = ...
                   (total magnetization)           amp/meter^2
                   M in amp/meter

Before embarking on our general discussion, some comments are in order about
the history and literature of our subject. Many of the most basic mathematical
concepts that we use in this book such as ‘set’, ‘function’, ‘manifold’, ‘coordinates’,
‘linear differential form’ were introduced (in their modern form), or became
commonly known, after the fundamental discoveries in thermodynamics.
Furthermore, most of the early heroes of our story (with the notable exceptions of
Kelvin and Helmholtz) were not mathematically trained, or (Carnot) chose not to
express themselves in mathematical language. As a result, most of the standard texts
to this very day use a mathematically obscure language reflecting the early
formulations of the subject, with a plethora of formulas involving relations between
partial derivatives. Here is an example: the notion of a linear differential form and its
line integral together with a version of Stokes’ theorem (in the plane) were
introduced to the world by Ampere in the early 1820s in a breathtakingly original
series of papers combining important new mathematics with ingenious physical
experiments. The more general version of Stokes’ theorem for a line integral of a
one-form in three-space first occurs in a letter from Kelvin to Stokes in 1850. (Stokes
published it as an examination question (!) some years later.) Yet in all his papers on
thermodynamics Kelvin never once uses Stokes’ theorem, although it would have
greatly simplified his arguments.

Now we have expressed the heat added to a system as a linear differential form, α,
on the manifold of equilibrium states. The functions T and V have independent
differentials and so if the manifold is two-dimensional we can use T and V as local
coordinates and write

α = Λ_V dV + C_V dT,        (22.1)

where Λ_V and C_V are functions on the manifold of equilibrium states. Similarly, the
functions p and T are independent so we can use them as coordinates and write

α = Λ_p dp + C_p dT        (22.2)


where Λ_p and C_p are two other functions. (Strictly speaking we should, from the
historical point of view, use an empirical temperature, θ, instead of T since we are


ErnsnrnrsTiMas^ 
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But then we would have to rewrite all of the formulas with T when we return to the 
modern period. So we will let the reader make the various mental substitutions.) Of
course the four functions are related to one another by the change of variables
formula going from the V, T to the p, T coordinates, and we shall return to this point
shortly. The underlying physical law expressed by the two preceding assertions was
formulated as the ‘doctrine of latent and specific heats’. The function Λ_V was known
as ‘the latent heat with respect to volume’, C_V was called ‘the specific heat at
constant volume’, Λ_p was called ‘the latent heat
with respect to pressure’ and C_p was called ‘the specific heat at constant pressure’.


confusion between the concepts of heat and temperature, and it was thought that the
temperature of a body reflects the ‘total amount of heat that it contains’ (a
meaningless concept, of course, once we know that α is not closed). If we add heat we
raise the temperature. But if some of the heat added goes into expanding the volume
or increasing the pressure then it becomes ‘hidden’ or ‘latent’. Of course equations
(22.1) and (22.2) give clear mathematical formulations to rather mysterious sounding
physical ‘doctrines’.

In two dimensions many computations simplify because of the fact that if Ω_1 and
Ω_2 are two-forms with Ω_2 ≠ 0 then Ω_1 = f Ω_2 where f is a function. In other words,
we can form the quotient f = Ω_1/Ω_2. For example, take Ω_1 = α ∧ dp and
Ω_2 = α ∧ dV, and let ξ be a vector tangent to an adiabatic at the point x, so that
α(ξ) = 0. Evaluating both two-forms on the parallelogram
spanned by ξ and η shows that

f(x) = (α ∧ dp)/(α ∧ dV) = dp(ξ)/dV(ξ).

We can write the right hand side of this equation as

(dp/dV)_adiabatic

since it is the ratio of dp to dV
when evaluated on vectors tangent to adiabatics. Similarly, and with similar
meaning

(dT ∧ dp)/(dT ∧ dV) = (dp/dV)_isothermal.


If we take the exterior product of (22.2) with dp we get

α ∧ dp = C_p dT ∧ dp

and the exterior product of (22.1) with dV gives

α ∧ dV = C_V dT ∧ dV.

Dividing the first equation by the second gives the important result

(dp/dV)_adiabatic = γ (dp/dV)_isothermal where γ = C_p/C_V.        (22.3)

If we introduce the density

ρ = m/V

we can also write (22.3) as

(dp/dρ)_adiabatic = γ (dp/dρ)_isothermal, γ = C_p/C_V.        (22.4)
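
The passage from (22.3) to (22.4) is a change of variable from V to the density. The short computation below spells it out as a sketch, assuming that m denotes the fixed total mass of the gas, so that the density depends on the state only through V.

```latex
\begin{align*}
\rho = \frac{m}{V}
  &\;\Longrightarrow\;
  d\rho = -\frac{m}{V^{2}}\,dV = -\frac{\rho}{V}\,dV, \\
\frac{dp}{d\rho}
  &= \frac{dp}{dV}\,\frac{dV}{d\rho}
   = -\frac{V}{\rho}\,\frac{dp}{dV}
  \quad\text{along any curve,} \\
\left(\frac{dp}{d\rho}\right)_{\text{adiabatic}}
  &= -\frac{V}{\rho}\left(\frac{dp}{dV}\right)_{\text{adiabatic}}
   = -\frac{V}{\rho}\,\gamma\left(\frac{dp}{dV}\right)_{\text{isothermal}}
   = \gamma\left(\frac{dp}{d\rho}\right)_{\text{isothermal}} .
\end{align*}
```

The same factor -V/ρ multiplies both sides, so the ratio γ of the adiabatic slope to the isothermal slope is unchanged.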

This result is important for the following physical and historical reasons. One of
Newton’s famous results was to show that his laws implied that the speed of sound, c,
in a gas is given by

c^2 = dp/dρ        (22.5)

where Newton assumed that p and ρ are related by the equation

p = Nρ where N is a constant,        (22.6)

i.e.

pV = const.        (22.7)

Now the pressure, density, and speed of sound (in air for example) can be
independently measured, and Newton’s assertion that c^2 = p/ρ was repeatedly
contradicted by experiment for the next 100 years. The square of the speed of sound
was found to be greater than that predicted by Newton’s formula by a factor of about
1.4. On the other hand, experiments by Gay-Lussac and others showed that equation
(22.7) holds at constant temperature, but that the constant on the right hand side of
(22.7) varies with the temperature. Thus, if we use the independent functions p and V as
coordinates on the manifold of equilibrium states, equation (22.7) for differing
values of the constant gives the equations for isothermals, and we can write Newton’s
formula as

c^2 = (dp/dρ)_isothermal.


Laplace (in 1816) argued on physical grounds (essentially that the disturbance
propagates too quickly for heat to be exchanged) that one should replace Newton’s
formula by

c^2 = (dp/dρ)_adiabatic = γ (dp/dρ)_isothermal.        (22.8)
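
The size of Laplace's correction is easy to see numerically. The sketch below compares the two formulas, using the fact that under Newton's assumption (22.6) the isothermal derivative dp/dρ equals p/ρ. The sample values for air (pressure, density and ratio of specific heats) are illustrative round numbers assumed for this example and are not taken from the text.

```python
import math


def newton_speed(pressure, density):
    """Newton's isothermal formula c^2 = p / rho."""
    return math.sqrt(pressure / density)


def laplace_speed(pressure, density, gamma):
    """Laplace's adiabatic formula c^2 = gamma * p / rho."""
    return math.sqrt(gamma * pressure / density)


# Illustrative values for dry air near 0 degrees C at one atmosphere (assumed).
P_AIR = 101325.0   # pressure in N/m^2
RHO_AIR = 1.293    # density in kg/m^3
GAMMA_AIR = 1.4    # ratio C_p / C_V for air

c_newton = newton_speed(P_AIR, RHO_AIR)
c_laplace = laplace_speed(P_AIR, RHO_AIR, GAMMA_AIR)
print(round(c_newton), round(c_laplace))  # roughly 280 and 331 m/s
```

The square of the speed changes by the factor γ, so the speed itself changes by the square root of γ, which is about 1.18 for air.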


To quote Laplace: ‘The true speed of sound equals the product of the speed 
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according to Newton’s formula with the square root of the ratio of the specific heat 
of air subject to the constant pressure of the atmosphere at various temperatures to 





α = T dS, and combine them as

dU = T dS - p dV.        (22.10)

If we take the exterior derivative of this equation we get

dT ∧ dS = dp ∧ dV.        (22.11)

How do we determine the functions T, S and U by observations, say in terms of the 




mi mm 




and adiabatics are given by equations (22.7) and (22.9). So if we introduce the
functions

t(p, V) = pV and a(p, V) = pV^γ

then the isothermals are the level curves of t and
the adiabatics are the level curves of a.        (22.15)

Since t and T have the same level curves, and since both have nowhere zero
differentials, we conclude that T can be expressed as a function of t, i.e. that
T = T(t) with T'(t) nowhere zero. Similarly, S = S(a). Our problem is to determine
the functions T(t) and S(a). Now

dT ∧ dS = T'(t)S'(a) dt ∧ da = dp ∧ dV

by equation (22.11). So we must use the explicit expressions for t and a as functions of






lermodynamics 



determined up to a multiplicative constant. So with no loss of generality we may
assume that T' = 1 and then S'(a) = [(γ - 1)a]^(-1) and so

T = t + T_0 and S = (γ - 1)^(-1) log a + S_0,

where T_0 and S_0 are constants. Remember that S is only determined up to an
additive constant. But we still must determine T_0 and the function U. For this
purpose we substitute the above values into (22.10) to get, after some computation,

dU = (γ - 1)^(-1) [T_0 d log a + dt].

We now call on an additional experiment - the Joule-Thomson experiment - which
shows that if we allow a gas to expand adiabatically into a vacuum there is
(essentially) no change in temperature. In such an expansion there is no heat added


constant T n vanishes. Thus 


Of course the internal energy is proportional to the total amount of gas present, and 




TjjiiiBflariawtiaiHMiiiaifi] uit kiit*a riwBiJ irai rvurai ■■■ > i ■ikr/iiai'ViB r^iniaTJii^i 


scale. So let us write 


where n denotes the number of moles present, and T the universal temperature 
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into this scale need not detain us.) Then R is a constant conversion factor from units 
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that 

dU ∧ dV = α ∧ dV = C_V dT ∧ dV
and then from (22.16) and (22.17) that C_V is a constant given by

C_V = (γ - 1)^(-1) nR.

Of course then C_p must also be a constant and

C_p - C_V = nR.

All of the above properties were derived on the basis of three assumptions which
define an ideal gas:

(i) the law of Boyle, Gay-Lussac (equation (22.7)) which says

pV = nRT,

(ii) that U is a function of T alone (the Joule experiment), and
(iii) that γ is a constant.

Let us summarize our conclusions, but state them per mole. That is, assume $n = 1$ 
and write lower case letters, $v$ for $V$, $c_p$ for $C_p$ etc. to indicate that we are dealing in 
molar quantities. We then have 

Ideal gas laws. 

the isotherms are $pv = RT$, 

the adiabatics are $pv^{\gamma} = \text{const.}$, 

$$u = (\gamma - 1)^{-1} RT,$$

$$c_v = (\gamma - 1)^{-1} R,$$

$$c_p = [\gamma/(\gamma - 1)]R \quad\text{so}\quad c_p - c_v = R,$$

$$s = (\gamma - 1)^{-1} R \log pv^{\gamma} + s_0 = R \log v + c_v \log T + s_0'.$$
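The two expressions for the molar entropy in the summary above can be checked numerically. The following is a minimal sketch, not part of the original text: the numerical values of `R` and `gamma` and the function names are illustrative assumptions, and both additive constants are set to zero, so the two forms should differ by a state-independent offset.

```python
import math

R = 8.314            # gas constant; any positive value works for this check (assumed)
gamma = 5.0 / 3.0    # assumed constant ratio c_p / c_v

c_v = R / (gamma - 1.0)
c_p = gamma * R / (gamma - 1.0)


def entropy_pv(p, v):
    """s = (gamma - 1)^{-1} R log(p v^gamma), taking s_0 = 0."""
    return R / (gamma - 1.0) * math.log(p * v ** gamma)


def entropy_vT(v, T):
    """s = R log v + c_v log T, taking s_0' = 0."""
    return R * math.log(v) + c_v * math.log(T)


def offset(v, T):
    """Difference of the two forms at the state with pv = RT."""
    p = R * T / v
    return entropy_pv(p, v) - entropy_vT(v, T)


# The offset should be the same at every state (it equals c_v log R).
offsets = [offset(v, T) for v, T in [(0.01, 250.0), (0.02, 300.0), (0.5, 900.0)]]
```

The constant offset is just the difference between the two additive constants $s_0$ and $s_0'$, which is why the text is free to write both forms.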

The quantities that can be directly measured for real gases are p, v, T, c v , c p and y. To 
measure c v the gas is put in a thin walled steel flask with a heating coil wrapped 
around it. The heat delivered is measured by the known current flowing through the 
coil and the temperature change of the gas is measured. To measure c p the gas flows 
at constant pressure through a similar heating coil arrangement and the difference 
between inlet and outlet temperatures is measured. The results of these experiments 
at low pressures, where the gas behaves approximately like an ideal gas, are as 
follows: 

All gases 

$c_v$ and $c_p$ are functions of $T$ only 

and 

$c_p - c_v = R$. 

Monatomic gases such as He, Ne, A, and most metallic vapors 

$c_v$ is constant over a wide range of $T$ and is very nearly equal to $\tfrac{3}{2}R$. So $c_p$ is 
very nearly equal to $\tfrac{5}{2}R$ and $\gamma$ is constant over a wide range of $T$ and close to $\tfrac{5}{3}$. 

Permanent diatomic gases such as O$_2$, N$_2$, NO, CO, and air 

$c_v$ is constant at ordinary temperatures and approximately equal to $\tfrac{5}{2}R$. So 
$c_p$ is approximately $\tfrac{7}{2}R$ and $\gamma$ is constant at ordinary temperatures and 
approximately $\tfrac{7}{5}$, and decreases as the temperature is raised. 

It is this value of $\tfrac{7}{5}$ for $\gamma$ which gives Laplace’s account of the speed of sound in air. 
The true meaning of the fractions $\tfrac{3}{2}$, $\tfrac{5}{2}$, etc. will only become apparent in the 
framework of statistical mechanics. 

Now for real gases, we can carry out the entire discussion starting with equation 
(22.14). We no longer make the assumption that the isothermals and adiabatics are 
given by equations (22.12) and (22.13) or that $\gamma$ is constant. But one can 
experimentally determine the isothermals and $\gamma$ and hence solve equation (22.3) 
(perhaps only numerically) to get the adiabatics, and then find $T'(t)$ and $S'(a)$ to get $S$ 
and then $U$ by solving equation (22.10). Over the past 150 years this information has 
been accumulated for various substances and is available in chart form. Furthermore 
certain functions in addition to the ones we have been considering are 
useful for special purposes. For example, the enthalpy $H$ is defined as 

H = U + pV. 

It is important for the following reason. Equation (22.10) says that the heat added to 
a system which is held at constant volume is equal to the change in the values of $U$. 
But in a chemical laboratory, it is much more convenient to keep the pressure 
constant (say everything at atmospheric pressure) and let the volume vary. Now 
differentiating the above equation gives 

$$dH = \alpha + V\,dp$$

so differences in H give the amount of heat added under constant pressure. Thus 
heats of chemical reaction are recorded as differences in enthalpy. 

A particularly useful type of empirical chart is a Mollier diagram in which $p$ 
and $H$ are used as coordinates and the level curves of $S$, $V$, and $T$ are then drawn 
in. An example is given opposite. 

There are several other combinations of U, p, V and S which give rise to functions 



Figure 22.7 The submanifold $\mathcal{L}$ is Lagrangian. Projection to the $\beta$, $\nu$ 
plane allows the introduction of $\beta$, $\nu$ as local coordinates. Similarly, 
projection to the $U$, $V$ plane allows us to use $U$ and $V$ as coordinates. 
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statistical mechanics. First consider the Planck function Y defined as follows: 

$$Y = S - (RT)^{-1} U - (RT)^{-1} pV.$$

To write this function in more convenient form let us introduce 

$$\beta = (RT)^{-1}$$

and 


iversion is 

/• i \ — 1 


that we had absorbed $R$ into the definition of $T$ from the very beginning, so that we 
choose to measure temperature in units of energy and the gas constant disappears. 
Since $\alpha = T\,dS$ has units of energy, and we are measuring $T$ in units of energy, we see 
that $S$ also becomes dimensionless with this choice of temperature scale. Thus both $S$ 
and the Planck function 

$$Y = S - \beta U - \nu V$$ (22.19) 

(22.10), as 

$$dS = \beta\,dU + \nu\,dV.$$ (22.20) 

It then follows from equation (22.19) that 

$$dY = -U\,d\beta - V\,d\nu.$$ (22.21) 

We wish to interpret equations such as (22.20) and (22.21) as follows: Consider a 

£ __ ... • i' 


exterior two-form 
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Notice that on all of [R 4 we have 

$$\Omega = d(\beta\,dU + \nu\,dV)$$

and also 

$$\Omega = d(-U\,d\beta - V\,d\nu).$$

A two-dimensional submanifold, $\mathcal{L}$, of $\mathbb{R}^4$ is called Lagrangian if the two-form $\Omega$ 
vanishes identically when restricted to $\mathcal{L}$. For any two-dimensional submanifold of 
$\mathbb{R}^4$ we can consider $\beta$ and $\nu$ as functions. If their differentials are linearly independent 
when restricted to the submanifold, we can use $\beta$ and $\nu$ as local coordinates. 
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Figure 22.8 Two systems with three kinds of contact. They can exchange 
heat (top) material (middle) and volume (bottom). 


and hence (locally) there is a function $S$ defined on $\mathcal{L}$ such that equation (22.20) holds 
on $\mathcal{L}$. Similarly, since 

$$d(-U\,d\beta - V\,d\nu) = 0 \quad\text{when restricted to } \mathcal{L}$$

we conclude that (locally) there is a function $Y$ such that equation (22.21) holds on $\mathcal{L}$. 
Thus we can formulate the combined first and second laws as saying that 

the manifold of equilibrium states is a Lagrangian submanifold of 
$(\beta, \nu, U, V)$ space. 

But let us carry the mathematical discussion a little further. Suppose we introduce U 

S. This means that we are given S as an explicit function of U and V. If we compare 
the equation 

$$dS = (\partial S/\partial U)\,dU + (\partial S/\partial V)\,dV$$

with equation (22.20) we see that 

$$\beta = \partial S/\partial U \quad\text{and}\quad \nu = \partial S/\partial V.$$
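As an illustration of how a generating function determines the equilibrium states, here is a minimal numerical sketch. It assumes one mole of ideal gas with $R$ absorbed into $T$, for which $S(U, V) = c_v \log U + \log V$ up to an additive constant; this explicit formula, the value of `gamma`, and all function names are assumptions made for the example rather than statements from the text.

```python
import math

gamma = 1.4                  # assumed constant ratio of specific heats
c_v = 1.0 / (gamma - 1.0)    # molar specific heat with R absorbed into T


def S(U, V):
    """Assumed ideal-gas entropy as a function of U and V (additive constant dropped)."""
    return c_v * math.log(U) + math.log(V)


def partial(f, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative of f in argument i."""
    up, down = list(args), list(args)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def state_from_generating_function(U, V):
    """Recover beta, nu, and then T = 1/beta and p = nu/beta from S(U, V)."""
    beta = partial(S, (U, V), 0)
    nu = partial(S, (U, V), 1)
    return beta, nu, 1.0 / beta, nu / beta


beta, nu, T, p = state_from_generating_function(2.5, 4.0)
```

For this assumed $S$ the recovered quantities satisfy $pV = T$ and $U = c_v T$, so the single function $S(U, V)$ encodes both the equation of state and the internal energy.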
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type diagram with U and V as coordinates and with the level curves and values of 
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when S is given as a function of U and V We say that S is a generating function of the 


Similarly if we start with $\beta$ and $\nu$ as local coordinates and are given $Y$ as a 
function of $\beta$ and $\nu$, then it follows from equation (22.21) that we can recover $U$ and 
$V$ as functions of $\beta$ and $\nu$ by 

$$U = -\partial Y/\partial\beta \quad\text{and}\quad V = -\partial Y/\partial\nu.$$




map expressing $\beta$ and $\nu$ as functions of $U$ and $V$.) 

We can play this game with still other choices of variables as local coordinates. 
For example, we can write (on all of $\mathbb{R}^4$) 

$$\Omega = d(-U\,d\beta + \nu\,dV)$$

and, correspondingly 

$$dZ = -U\,d\beta + \nu\,dV \quad\text{on } \mathcal{L}$$

where the Massieu function $Z$ is defined by 

$$Z = S - \beta U.$$ (22.24) 

Thus the Massieu function is a generating function for $\mathcal{L}$ in terms of the variables 
$\beta$ and $V$. Coded into (22.24) are the first and second laws of thermodynamics 
together with a complete description of the equilibrium states of our system! 

When we get to the subject of equilibrium statistical mechanics, we will find that 
the functions S and Z are the truly basic ones. The function S will be given an 
interpretation in terms of probability theory whose significance extends far beyond 
the domain of thermodynamics. The function Z will provide the link between the 
microscopic theory and the observed macroscopic phenomena. That is, a very 
general construction will show how a model of the energy at the atomic or molecular 
level leads to a definite expression for $Z$ as a function of $\beta$ and $V$. (Much of the 
technical aspects of the subject then becomes the purely mathematical question of 
how to evaluate or approximate this expression.) Of course we will develop the 
theory so as to apply to systems with more than one configurational variable. 

Before getting to statistical mechanics we need to develop some further ideas 

an amplification on a remark we made at the beginning. We mentioned that many 
physically measurable quantities are given as quotients of two-forms on the surface 
of equilibrium states, and illustrated this with the two ‘specific heats’. Here are two 
more: 


coefficient of thermal expansion at constant pressure $= \dfrac{dV \wedge dp}{V\,dT \wedge dp}$ 

coefficient of compressibility at constant temperature $= \dfrac{dV \wedge dT}{V\,dT \wedge dp}$ 


We will leave some others and manipulations with them to the exercises. 
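As a worked illustration of these quotients (a sketch added for orientation, assuming the ideal gas law $pV = nRT$ and using $p$, $T$ as coordinates on the surface of equilibrium states), both coefficients can be computed directly:

```latex
% Assumption: ideal gas, V = nRT/p, with (p, T) as local coordinates.
\begin{align*}
dV &= \frac{nR}{p}\,dT - \frac{nRT}{p^{2}}\,dp, \\
dV \wedge dp &= \frac{nR}{p}\,dT \wedge dp
  \quad\Longrightarrow\quad
  \frac{dV \wedge dp}{V\,dT \wedge dp} = \frac{nR}{pV} = \frac{1}{T}, \\
dV \wedge dT &= -\frac{nRT}{p^{2}}\,dp \wedge dT = \frac{nRT}{p^{2}}\,dT \wedge dp
  \quad\Longrightarrow\quad
  \frac{dV \wedge dT}{V\,dT \wedge dp} = \frac{nRT}{p^{2}V} = \frac{1}{p}.
\end{align*}
```

So under this assumption the coefficient of thermal expansion is $1/T$ and the isothermal compressibility is $1/p$; the antisymmetry of the wedge product supplies the sign that appears as a minus sign in the partial-derivative form of the compressibility.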


22.5. Conditions for equilibrium 

We return to the study of systems with an arbitrary number of configurational 
variables. Up until now we have defined the function S only on the manifold of 
equilibrium states. It is important to observe that certain functions, like temperature 
or pressure, are only defined on the manifold of equilibrium states. It does not 
make sense to talk of the temperature of a gas (as a whole) unless it is in equilibrium. 
It was one of the great discoveries of the subject that the entropy is a function 
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which is defined on the set of all states. That is, there is a function, Ent, defined 
on the set of all states of the system such that 

Ent (p) = S(p) 

whenever p is an equilibrium state. We shall not write down a definition of Ent 




states of a system. Once we get to statistical mechanics, where we will give such 
a description, we will write down an explicit formula for Ent, due essentially to 
Boltzmann. However, many of the ideas of this section were developed by Clausius 
and others prior to the discovery of the statistical interpretation. 

We shall make use of one property of the entropy function for two systems in 
contact. (We can regard this property as an axiom at present - it will be obviously 


of the new system is a pair of states $p = (p_1, p_2)$ of the individual component 

$$U(p) = U_1(p_1) + U_2(p_2).$$
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In order to derive Gibbs’ rule from the maximum entropy principle we make use of the 
physical observation that we can always add heat to a system and so continuously 
increase both the energy and the entropy, and we can do so without changing the 
configurational variables. Suppose that p is an equilibrium state, and suppose 
(contrary to fact) that there were some other state a near p with Ent (a) = Ent (p) and 
U(a) < U(p). By adding heat we could gradually increase both the entropy and the 
internal energy until we reach a new state having the same internal energy as p but 
having a larger entropy, contradicting the maximal entropy principle. This 
establishes Gibbs’ principle. 

To apply Gibbs’ principle we make a number of preliminary remarks. Suppose 
that $f_1, \ldots, f_k$ are functions of the configurational variables of a system. We can then 
obtain a new system by fixing the values of these functions. For example, imagine 
that the two systems in figure 22.9 were initially separated (and each had its own 
piston). Before the systems were brought together, we had a system which was 
simply the direct product of the two subsystems, and so there were four 
configurational variables, $V_1$ and $V_2$, the volumes of each, and $M_1$ and $M_2$, the total 
mass of the gas in each. (Let us assume for the sake of the discussion that we have one 
and the same type of gas in each subsystem.) By bringing the systems into contact as 
in figure 22.9 we have fixed the total volume and the total mass. In other words we 
have imposed constraints of the form $f = \text{const}$ and $g = \text{const}$ where 

$$f = V_1 + V_2 \quad\text{and}\quad g = M_1 + M_2,$$

cutting the number of independent variables down from four to two. 

Let us start with some system and obtain a new system by imposing a number of 
constraints on the configurational variables. The constrained system will have its 
own equilibrium states which can be characterized by maximum entropy or 
minimum internal energy, but of course, relative to the more restricted class of 
states - those satisfying the constraints. Suppose that we are given a subset, $\mathcal{M}$, of 
the states of the unconstrained, original system which contains all the equilibrium 
states of the original system and suppose that $\mathcal{M}$ is a manifold. Furthermore assume 
that the function Ent and the constraint functions $f_1, \ldots, f_k$ are differentiable 
functions on $\mathcal{M}$. Let p be an equilibrium state of the constrained system, where the $f$s 
are held constant. The Gibbs minimum energy principle asserts that $U$ must take a 
local minimum at p subject to the constraints 

$$\text{Ent} = \text{const}, \quad f_1 = \text{const}, \ldots, f_k = \text{const}.$$

Then the method of Lagrange multipliers implies that there exist functions 
$\lambda_0, \ldots, \lambda_k$ such that 

$$dU = \lambda_0\, d\,\text{Ent} + \lambda_1\, df_1 + \cdots + \lambda_k\, df_k.$$ (22.25) 

For example, suppose we consider the case illustrated by figure 22.9, where we take 
$\mathcal{M}$ to be the set of all pairs of equilibrium states of the individual systems, if they were 
not in contact with one another. Each subsystem when considered separately has 









two configurational variables, V and M. Thus for the two subsystems we have 


$$dU_1 = T_1\,dS_1 - p_1\,dV_1 + \mu_1\,dM_1 \quad\text{and}\quad dU_2 = T_2\,dS_2 - p_2\,dV_2 + \mu_2\,dM_2.$$

(The coefficient, $\mu$, of $dM$, is called the ‘chemical potential’.) The manifold $\mathcal{M}$ is 
six-dimensional since the set of equilibrium states of each system separately is 
three-dimensional. We are assuming that 

$$U = U_1 + U_2$$

and the additive property of the entropy function asserts that 

$$\text{Ent} = S_1 + S_2.$$

Thus 

$$dU = T_1\,dS_1 + T_2\,dS_2 - p_1\,dV_1 - p_2\,dV_2 + \mu_1\,dM_1 + \mu_2\,dM_2,$$

while equation (22.25) becomes 

$$dU = \lambda_0\,(dS_1 + dS_2) + \lambda_1\,(dV_1 + dV_2) + \lambda_2\,(dM_1 + dM_2)$$

since the constraint functions are $V_1 + V_2$ and $M_1 + M_2$. 
Since the differentials occurring on the 
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$T_1 = T_2$, $p_1 = p_2$ and $\mu_1 = \mu_2$. 

Thus the temperatures, pressures and chemical potentials must be equal. It is clear 
that this argument generalizes to the case where we have several (not just two) 

different species of substance. So if the expression for the internal energy on the 
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the conditions for equilibrium become 


$$T_1 = T_2 = \cdots,$$

$$p_1 = p_2 = \cdots,$$

$$\mu_1 = \mu_2 = \cdots \quad\text{(for each species of substance).}$$ (22.26) 
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These are then the general conditions for equilibrium and give us some further 


For example, let us consider a gas consisting of a single type of substance. We can 
imagine each small region of the gas as being a subsystem. If we think that each small 
subsystem is in a state of equilibrium (considered separately) then we would get a 
value of $T$, $p$ and $\mu$ for each such region. In other words, we could think of $T$, $p$ and $\mu$ 
as being a function on three-dimensional space which assigns to each point the 
values of temperature, pressure and $\mu$ in a small region about that point. The 
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constant: that the same temperature, pressure and $\mu$ persist throughout. Now the gas 
as a whole, with a constant total mass, say, has only one configurational degree of 
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freedom, V. Thus we can use p and T as coordinates on the two-dimensional space of 
equilibrium states of the gas as a whole. This means that the value of $\mu$ can be 
expressed as a function of $p$ and $T$. Thus for a gas of a pure substance in equilibrium 
with itself, $\mu$ is some definite function of $p$ and $T$, $\mu = \mu_{\text{gas}}(p, T)$. Suppose this same 
substance can exist in liquid form. For the liquid in equilibrium with itself we would 
get some other function, $\mu_{\text{liq}}(p, T)$. Now consider a combined system consisting of 
liquid and gas of the same substance. The gas and the liquid are separately in 
equilibrium and suppose that the system as a whole is in equilibrium. Then 
conditions (22.26) imply 

$$p_{\text{liq}} = p_{\text{gas}},$$

$$T_{\text{liq}} = T_{\text{gas}}$$

and 

$$\mu_{\text{liq}} = \mu_{\text{gas}}.$$

If we let $p$ and $T$ denote the common values in the first two equations, the third 
equation becomes 

$$\mu_{\text{liq}}(p, T) = \mu_{\text{gas}}(p, T).$$

Since $\mu_{\text{liq}}$ and $\mu_{\text{gas}}$ are (usually) independent functions of $p$ and $T$, this last equation 
defines a curve in the $p$, $T$ plane. In other words a liquid and gas of the same 
substance can only coexist in equilibrium when there is a definite relation between 
temperature and pressure. If there were three phases, say gas, liquid and solid, then 
they could only exist at equilibrium if 

$$\mu_{\text{sol}}(p, T) = \mu_{\text{liq}}(p, T) = \mu_{\text{gas}}(p, T)$$

and these equations then (in general) determine a point (the so called triple point) 
with a definite value of temperature and pressure. Finally one cannot have four 
distinct phases of the same substance coexisting in equilibrium. This was Gibbs’ 
famous derivation of his celebrated phase rule. It (and its obvious generalizations) 
are all contained in equation (22.26) which is an immediate consequence of equation 
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We now want to examine microscopic theories that can serve as a model for 
thermodynamics. We begin with a discussion of classical statistical mechanics 
wherein a ‘state’ is defined as a ‘probability measure’ of a certain kind and the 
entropy of a state will be ‘the amount of disorder in the state’. The mathematical 
foundations of the theory of probability were laid in the first third of this century, 
principally by Borel, Lebesgue and Kolmogorov. Again this was long after the basic 
ideas of statistical mechanics were put forth by Boltzmann and Gibbs, leading to 
another communications barrier. The basic language of the theory of probability is 
measure theory. We shall use this language without going into the theory. For any of 
the deeper results in probability theory which involve infinite numbers of 
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alternatives the use of the full machinery of measure theory is essential. As we will 
only be using the most elementary facts, we will only have to use the most elementary 


notions 
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The whole machinery of measure theory is devoted to constructing a suitably rich 
class of sets, on which $\mu$ is defined, and a suitably broad class of functions whose 
integral is well defined. In all of our applications the sets and functions will be so 
simple-minded that we need not avail ourselves of the theory. As we have mentioned 
it is essential to use the full theory when one wants to probe a bit deeper. 

A probability measure is a measure which assigns the value 1 to the set $M$, i.e. 
$\mu(M) = 1$. This corresponds to the convention in probability theory that ‘probabilities’ 
take on values between 0 and 1. 

The fundamental concept of a ‘system’ in statistical mechanics is a measure space. 
The set $M$, the collection of subsets, $\mathcal{A}$, and the measure $\mu$ are given to us by the 
physics or the geometry underlying the theory. The measure $\mu$ will not, in general, be 
a probability measure. But it plays the role of providing the ‘a priori state of 
knowledge’ of the system. We illustrate with a series of examples. 

A. Finite sample space, equal a priori probability. This is the simplest example. $M$ is 
a finite set, $\mathcal{A}$ is the collection of all subsets of $M$. Here $\mu(\{m\}) = 1$ for every $m \in M$. 
This describes the situation where there are a finite number of alternatives and we 
have no reason to prefer any one to any other. 

B. $M$ = the real line, $\mathbb{R}$, with its standard collection of (Lebesgue) measurable sets 
and its usual measure $\mu$. Thus $\mu([a, b]) = b - a$ for any interval $[a, b]$. The measure $\mu$ 

course $\mu$ is not a probability measure since $\mu(\mathbb{R}) = \infty$. Notice that if $\rho$ is any 
integrable function with 

$$\int_{-\infty}^{\infty} \rho(x)\,dx = 1$$

then $\rho$ determines a probability measure, $\rho\,dx$, on $\mathbb{R}$: 

$$\rho\mu([a, b]) = \int_a^b \rho(x)\,dx.$$

The measure $\mu$ ($= dx$) thus determines a distinguished class of probability measures 
on $\mathbb{R}$. (In the technical language of measure theory, it is the class of those probability 
measures which are absolutely continuous with respect to $dx$. Not all probability 
measures on $\mathbb{R}$ are of this form. For example, consider the probability measure, $P$, 
which assigns $P(\{0\}) = \tfrac{1}{2}$, $P(\{1\}) = \tfrac{1}{2}$, and $P(A) = 0$ if $\{0, 1\} \cap A = \emptyset$. Thus $P$ 
corresponds to the probability measure in the case that either 0 or 1 is achieved, each one 
with equal probability. Clearly $P$ is not of the form $\rho\,dx$.) 

For any measure $\mu$ on a measure space and any integrable non-negative function 
$\rho$ we obtain a new measure, which we denote by $\rho\mu$, by setting 

$$\rho\mu(A) = \int_A \rho\,\mu.$$
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This suggests the following general definition: 

Definition: Let $(M, \mathcal{A}, \mu)$ be a measure space. A (statistical) state of $(M, \mathcal{A}, \mu)$ is a 
probability measure on $M$ of the form $\rho\mu$ where $\rho \ge 0$ is an integrable function. In other 
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Examples, continued 


lx where n(x) = U tor x 


(x) = I tor x 


corresponds to the situation where we are sure that the real number is nor '' 


take $M = \mathbb{R}^+ = \{x \mid x > 0\}$ since the measure $\mu$ is concentrated there. 

D. Occupancy for classical particles. Here $M = \mathbb{N} = \{0, 1, 2, 3, \ldots\}$ and $\mu(\{k\}) = 1/k!$. 
The measure $\mu$ assigns the weight $1/k!$ to the possibility of $k$ particles 
occupying the box. Let us explain why this is an appropriate measure. Suppose we 
had many, say $n$, boxes and several, say $N$, particles. The number of ways of 
distributing the particles so that $k_1$ particles lie in the first box, $k_2$ in the second, etc., 
is 

$$\frac{N!}{k_1! \cdots k_n!}.$$

(Here, of course, the particles are considered as ‘distinguishable’: interchanging a 
pair of particles between different boxes gives a different way of distributing the 
particles. There are $N!$ different permutations of the particles, but we must divide by 
$k_1! \cdots k_n!$, since permuting particles within a box does not give a different way of 
distributing particles.) If we are unaware of $N$ and $n$, the best we can say is that the 
number of ways of distributing the particles so that $k_1$ end up in box number 1 is 
proportional to $1/k_1!$. This is our a priori assignment of relative probabilities. 
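The counting argument in example D can be confirmed by brute force for small cases. The sketch below is illustrative only (the function names and the choice of 5 particles in 3 boxes are assumptions): it enumerates every assignment of distinguishable particles to boxes and compares the tally for each occupancy pattern with $N!/(k_1! \cdots k_n!)$.

```python
from collections import Counter
from itertools import product
from math import factorial


def count_by_enumeration(N, n):
    """Tally, for each occupancy tuple (k_1, ..., k_n), how many of the n**N
    assignments of N distinguishable particles to n boxes produce it."""
    counts = Counter()
    for assignment in product(range(n), repeat=N):
        occupancy = tuple(assignment.count(box) for box in range(n))
        counts[occupancy] += 1
    return counts


def multinomial(N, ks):
    """N! / (k_1! ... k_n!), computed with exact integer arithmetic."""
    result = factorial(N)
    for k in ks:
        result //= factorial(k)
    return result


counts = count_by_enumeration(5, 3)   # 5 particles, 3 boxes (small enough to enumerate)
```

Every entry of `counts` should agree with `multinomial(5, ks)`, and the entries should add up to $3^5$, the total number of assignments.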

E. Occupancy according to Fermi-Dirac. We have a box which can contain at most 
one particle. Here $M = \{0, 1\}$ and $\mu(\{0\}) = \mu(\{1\}) = 1$. This, of course, is a subcase of 
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and p({k}) = 0 for all k > 1. The scheme represents occupancy with an ‘exclusion 




applications, the ‘box’ may be rather abstract. It may represent a single particle 
quantum state, for example, for particles obeying the Pauli exclusion principle. 


F. Occupancy for Bose-Einstein particles. Here $M = \mathbb{N} = \{0, 1, 2, 3, \ldots\}$ and 
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example, a particle represents a disturbance of medium (a blip on a screen, for 
instance) and two particles represent a disturbance of twice the intensity, then in 



type. The Fermi-Dirac type particles, for example electrons, protons, neutrons, are 




called bosons. 

G. Binomial measure. Suppose that we have a large number, say N, of particles, 
each of which can be in one of two possible states ‘up’ and ‘down’. Suppose we are 









affected only by the total number of particles in the ‘up’ position minus the total 
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for simplicity, that N is even. Then #(up) - #(down) = 2m means that (N/2) + m 
particles are up and $(N/2) - m$ particles are down. This can happen in 
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$$\binom{N}{N/2 + m}$$
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different ways. 

If we are interested in the relative frequency of occurrence of $2m$ we divide by the 
largest binomial coefficient, which is $\binom{N}{N/2}$, and thus define 
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$$\mu(\{2m\}) = \frac{[(N/2)!]^2}{(N/2 + m)!\,(N/2 - m)!}.$$
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An application of Stirling’s formula $N! \approx (2\pi N)^{1/2} N^N e^{-N}$ shows that for large 
values of $N$ this last expression is fairly closely approximated by $e^{-2m^2/N}$. 


This approximation suggests 

H. Discrete Gaussian measure. We let $M = \mathbb{Z}$ = the set of all integers and 

$$\mu(\{m\}) = e^{-2m^2/N}.$$
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Suppose we are interested in the quantity $m/N$. (That is, suppose each ‘up’ or ‘down’ 
state contributes an effect of order $\pm 1/N$ to the quantity that we measure.) Then 
$m/N$ is extremely sharply peaked about the origin if $N$ is large. Thus, for example, if 
$N \sim 10^{24}$ then $\mu(\{m\})$ will have decreased by a factor of $e^{-1}$ by the time that $m/N$ has 
moved from the origin by about $10^{-12}$. 
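The quality of the Gaussian approximation to the normalized binomial weights can be inspected directly. The following is a small sketch under illustrative assumptions (the value $N = 10\,000$, the sample values of $m$, and the function names are not from the text); it uses exact integer binomial coefficients so that only the final ratio is rounded.

```python
from math import comb, exp


def binomial_weight(N, m):
    """C(N, N/2 + m) / C(N, N/2) for even N: the normalized binomial weight."""
    half = N // 2
    return comb(N, half + m) / comb(N, half)


def gaussian_weight(N, m):
    """The discrete Gaussian weight exp(-2 m^2 / N)."""
    return exp(-2.0 * m * m / N)


N = 10_000  # assumed even number of particles
rows = [(m, binomial_weight(N, m), gaussian_weight(N, m)) for m in (0, 25, 50, 100)]

for m, exact, approx in rows:
    print(f"m = {m:4d}   binomial = {exact:.6f}   gaussian = {approx:.6f}")
```

For these values the two columns agree closely, with the agreement degrading slowly as $m$ grows relative to $N^{1/2}$, which is the regime in which the Stirling estimate is applied.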

To actually ‘see’ the distribution we have to change the scale by a factor of $N^{1/2}$ 
rather than $N$. If we set $x_k = 2k/N^{1/2}$ then 

$$\mu\{a \le x_k \le b\} = \sum_{a \le x_k \le b} e^{-x_k^2/2}.$$

The sum on the right grows as approximately $N^{1/2}$. Indeed, if we divide b 



coordinates (canonical coordinates) $q$, $p$ and a distinguished measure (Liouville 
measure) 

$$\mu = dp\,dq.$$

For example, for a single classical particle moving in $\mathbb{R}^3$, the coordinates $q_1, q_2, q_3$ 
describe its position and the coordinates $p_1, p_2, p_3$ describe its momentum. Then 
$M = \mathbb{R}^6$ and 

$$dp\,dq = dp_1\,dp_2\,dp_3\,dq_1\,dq_2\,dq_3,$$

that is $\mathbb{R}^6$ with its usual measure. 


22.7. Products and images 

In this section we are going to discuss various ways of constructing systems. The first 
notion we wish to discuss is that of product. Suppose that we are given two systems, 
$(M_1, \mathcal{A}_1, \mu_1)$ and $(M_2, \mathcal{A}_2, \mu_2)$. We can then form the product system $(M_1 \times M_2, 
\mathcal{A}_1 \times \mathcal{A}_2, \mu_1 \times \mu_2)$. For example, suppose $(M_1, \mathcal{A}_1, \mu_1)$ and $(M_2, \mathcal{A}_2, \mu_2)$ are as in E, 
that is, they represent the possible number of particles in a box (figure 22.10). Then 



M({ *}) a TT = 7; 

Figure 22.9. 


$(M_1 \times M_2, \mathcal{A}_1 \times \mathcal{A}_2, \mu_1 \times \mu_2)$ represents the system consisting of the pair of boxes 
together (figure 22.11). 
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(MiX/x 2 ) (k, /) =~ 

Figure 22.10 


The measure $\mu_1 \times \mu_2$ is characterized by 

$$(\mu_1 \times \mu_2)(A_1 \times A_2) = \mu_1(A_1)\,\mu_2(A_2)$$

for product sets $A_1 \times A_2$, $A_i \in \mathcal{A}_i$, $i = 1, 2$. A state $\rho$ on the product system need 
not, of course, be of the form $\rho_1 \times \rho_2$. If $\rho$ is of the form $\rho = \rho_1 \times \rho_2$ then the 
corresponding probability measure $(\rho_1 \times \rho_2)(\mu_1 \times \mu_2) = \rho_1\mu_1 \times \rho_2\mu_2$ on $M_1 \times M_2$ 
is also a product measure. This corresponds to an assignment of independent 
probabilities to the events of $M_1$ and $M_2$. 
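The characterizing property of the product measure is easy to check on a discrete example. The sketch below is an illustration under stated assumptions: it uses the occupancy weights $1/k!$ of example D for both factors, restricts attention to finite sets, and the function names are hypothetical. Exact rational arithmetic avoids rounding.

```python
from fractions import Fraction
from math import factorial


def mu(k):
    """A priori weight 1/k! of a single box holding k classical particles (example D)."""
    return Fraction(1, factorial(k))


def measure(A):
    """mu(A) for a finite set A of occupation numbers."""
    return sum(mu(k) for k in A)


def product_measure(A1, A2):
    """(mu_1 x mu_2)(A1 x A2), computed by summing over the points (k, l) of A1 x A2."""
    return sum(mu(k) * mu(l) for k in A1 for l in A2)


A1, A2 = {0, 2, 3}, {1, 4}
lhs = product_measure(A1, A2)       # measure of the product set
rhs = measure(A1) * measure(A2)     # product of the individual measures
```

Here `lhs` and `rhs` coincide exactly, which is the discrete form of $(\mu_1 \times \mu_2)(A_1 \times A_2) = \mu_1(A_1)\mu_2(A_2)$.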






Similarly, we can form the product of any collection of systems $(M_i, \mathcal{A}_i, \mu_i)$ where $i$ 


Another way of constructing a system is via a map. Let $(M, \mathcal{A}, \mu)$ be a system and 
let $(N, \mathcal{B})$ be a measure space, that is $N$ is a set with a distinguished class of subsets. 




to the collection $\mathcal{A}$. We can then define the rule $f_*\mu$ which assigns to every $B \in \mathcal{B}$ the 
value 

$$(f_*\mu)(B) = \mu(f^{-1}(B)).$$

n sets in $\mathcal{B}$ is a measure, then it is ca 
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measure $\mu$ under the map $f$. Then we have a new system $(N, \mathcal{B}, f_*\mu)$ which we might 
call the pushed forward system. For example, suppose we begin with the product 
system $M = M_1 \times M_2$ of two systems of type E that we have just been considering. 
Let $N$ consist of the large box with the partition removed (figure 22.12). 


where 


$$f((k, l)) = k + l$$
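To see what the pushed forward measure looks like for the map $f((k, l)) = k + l$, here is a small sketch. It assumes, for illustration, that both boxes carry the classical occupancy weights $\mu(\{k\}) = 1/k!$ of example D; the function names are hypothetical. The weight of $\{n\}$ under $f_*(\mu \times \mu)$ is the product measure of the fibre $f^{-1}(n) = \{(k, l) : k + l = n\}$.

```python
from fractions import Fraction
from math import factorial


def mu(k):
    """Assumed single-box weight 1/k! (classical occupancy, example D)."""
    return Fraction(1, factorial(k))


def pushforward_weight(n):
    """(f_*(mu x mu))({n}): sum of mu(k) mu(l) over all pairs with k + l = n."""
    return sum(mu(k) * mu(n - k) for k in range(n + 1))


weights = [pushforward_weight(n) for n in range(6)]
```

By the binomial theorem each weight equals $2^n/n!$, so under this assumption the combined box again carries a weight of the form (constant)$^n/n!$, with the constant doubled because two equal boxes were joined.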



Thus the map $f$ has the effect of ignoring the fact that there are $k$ particles in the 




assertion that there are $k + l$ particles altogether in the larger region. Here, for 



Both the product construction and the map construction become more intuitive 





system corresponding to a box of volume V by setting 










Mk,x/i k , «*.*>) = — 

Figure 22.13 

Then we can remove the partition and consider the box of volume $V_1 + V_2$ 
(figure 22.15) 



Again we set $N = M_{V_1 + V_2}$ and $f\colon M_{V_1} \times M_{V_2} \to N$ by $f((k, l)) = k + l$. Then 
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Let us return to general considerations. Let be a system and give an image 

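system $(N, \mathcal{B}, f_*\mu)$. If $\sigma$ is any state on $(N, \mathcal{B}, f_*\mu)$ then the function $f^*\sigma$ defines a state on 
$(M, \mathcal{A}, \mu)$. Indeed $f^*\sigma \ge 0$ and it follows from the definition of integration that 

$$\int_M (f^*\sigma)\,\mu = \int_N \sigma\,(f_*\mu) = 1.$$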
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A state p on M also determines a state f*p on N but the definition lies considerably deeper. Before 
stating the result in general let us illustrate by an example. Suppose $M = \mathbb{R}^2$ with its standard choice 
of $\mathcal{A}$ and $\mu$ ($= dx\,dy$). Let $f\colon \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x, y) = (x^2 + y^2)^{1/2}$. Thus $f_*\mu = 2\pi r\,dr$. Let $\rho(x, y)$ 
be any continuous function on the plane with $\int \rho\,dx\,dy = 1$. Thus $\rho$ is a state. Define the state $f_*\rho$ by 

$$(f_*\rho)(r) = \frac{1}{2\pi} \int_0^{2\pi} \rho(r\cos\theta, r\sin\theta)\,d\theta.$$

Thus $(f_*\rho)(r)$ is the ‘average’ of $\rho$ over the circle of radius $r$, i.e. the average of $\rho$ over $f^{-1}(r)$. Then 
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$$\int_0^{\infty} (f_*\rho)(r)\,2\pi r\,dr = \int_0^{\infty}\!\!\int_0^{2\pi} \rho(r\cos\theta, r\sin\theta)\,r\,d\theta\,dr = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \rho(x, y)\,dx\,dy = 1.$$


Thus $f_*\rho$ is a state on $N$. Notice that $f_*f^*\sigma = \sigma$ but, in general, $f^*f_*\rho$ need not equal $\rho$. In the 
general situation, if $\rho$ is a state on $M$ then $\rho\mu$ is a probability measure on $M$. Therefore $f_*(\rho\mu)$ is a 
probability measure on $N$. Furthermore, if $B \in \mathcal{B}$ is any set such that 

$$(f_*\mu)(B) = 0$$

then 

$$\mu(f^{-1}(B)) = 0$$

and hence 

$$(f_*(\rho\mu))(B) = (\rho\mu)(f^{-1}(B)) = \int_{f^{-1}(B)} \rho\,\mu = 0.$$
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Thus $(f_*\mu)(B) = 0$ implies that $f_*(\rho\mu)(B) = 0$. A theorem in measure theory, called the 
Radon-Nikodym theorem, asserts that if $\nu_1$ and $\nu_2$ are measures such that $\nu_2(B) = 0$ implies $\nu_1(B) = 0$ for all 
$B$ then there exists a function $\sigma$ such that $\nu_1(B) = \int_B \sigma\,d\nu_2$. Taking $\nu_1 = f_*(\rho\mu)$ and $\nu_2 = f_*\mu$ we conclude 
that there is a non-negative integrable function $\sigma$ on $N$ such that 

$$(f_*(\rho\mu))(B) = \int_B \sigma\,d(f_*\mu)$$

for any $B$. Since $f_*(\rho\mu)$ is a probability measure we conclude that $\int_N \sigma\,d(f_*\mu) = 1$. Thus $\sigma$ is a state on 
$(N, \mathcal{B}, f_*\mu)$ and we define 

$$f_*\rho = \sigma.$$


22.8. Observables, expectations and internal energy 

A particular kind of map is an ordinary numerical valued function on $M$. Such a 
map is called an observable of the system. Roughly speaking, an observable of a 
system is a numerical property of the system that we can measure. We shall, when 
discussing theoretical points, usually denote an observable by the letter $J$. Specific 
observables, as they arise, will be denoted by letters connected with their physical 
significance. Of course, as part of our definition of the concept of observable, we 
assume that $J^{-1}([a, b])$ belongs to the collection $\mathcal{A}$ for every interval $[a, b]$ on the 
real line. This is a regularity assumption about the map $J$. 

Let J be an observable and let p be a state. For each interval [a, b] we can consider 











the subset $J^{-1}([a, b])$ of $M$ and then the integral 



line. The probability assigned to any interval being the probability of measuring the 
observable J in the interval when the system is in the state p; it is given by the 
preceding formula. 

There is some useful language in probability theory that we shall adopt here. The 
integral 



is called the expectation, or expected value of the function $J$ with respect to the 
probability measure $\rho\mu$. It is an average of $J$ where we weight subsets of $M$ according 









Suppose we want to consider the energy to be that associated with a ‘free particle 
constrained to lie in a box, $B$, of volume $V$’. We can do this by assuming an 
appropriate form for the potential $\mathcal{V}$. Let us take $\mathcal{V}$ to vanish when $(q_1, q_2, q_3)$ lies 
inside the box $B$, and to have some extremely large value when $(q_1, q_2, q_3)$ lies outside 
the box, $B$. The idea is that a particle has to overcome some huge potential barrier to 
get out of the box. We would then plug this choice of $\mathcal{V}$ into the expression for $H$ in 
(22.29). Now suppose that the state $\rho$ is such that it assigns negligible probability to 
extremely high values of the energy. That is, assume that Prob $(H > E; \rho)$ is 
essentially equal to zero for $E$ very large. To be precise, let us assume that this 
probability assignment is so small that in computing (22.29) we can ignore the 
contribution to the integral coming from values of $(q_1, q_2, q_3)$ lying outside the box. 
Inside the box $\mathcal{V} = 0$ and the integral becomes 

$$\int_{B \times \mathbb{R}^3} \tfrac{1}{2}(p_1^2 + p_2^2 + p_3^2)\,\rho(q_1, q_2, q_3, p_1, p_2, p_3)\,dq_1\,dq_2\,dq_3\,dp_1\,dp_2\,dp_3.$$ (22.30) 

To go further we need more information about the state $\rho$. For example, suppose 
that we expect that once we know that the particle is in the box, it is just as likely that 
it be at one location as at another, that is, $\rho$ does not depend on $(q_1, q_2, q_3)$ but is a 
function of $(p_1, p_2, p_3)$ alone. Then the integral (22.30) becomes 

$$V \int_{\mathbb{R}^3} \tfrac{1}{2}(p_1^2 + p_2^2 + p_3^2)\,\rho(p_1, p_2, p_3)\,dp_1\,dp_2\,dp_3.$$ (22.31) 


Suppose that we are told that $\rho$ is given by the so-called Maxwell velocity 
distribution law with parameter $\beta$; that is, suppose that 

$\rho(p_1, p_2, p_3) = F_\beta^{-1} \exp[-(\beta/2)(p_1^2 + p_2^2 + p_3^2)]$ 

where $F_\beta$ is a constant of proportionality depending on $\beta$. We must choose $F_\beta$ to 
make the total integral $\int_M \rho\mu = 1$. So we must choose 
$F_\beta = \int_{B \times \mathbb{R}^3} \exp[-(\beta/2)(p_1^2 + p_2^2 + p_3^2)]\,\mu$, or 

$F_\beta = V(2\pi/\beta)^{3/2}$ 

and (22.31) becomes 

$(\beta/2\pi)^{3/2} \int_{\mathbb{R}^3} \tfrac12(p_1^2 + p_2^2 + p_3^2) \exp[-(\beta/2)(p_1^2 + p_2^2 + p_3^2)]\,dp_1 dp_2 dp_3.$ 

This is a Gaussian integral which we can evaluate to get $(3/2)\beta^{-1}$. Thus with all these 
choices we would obtain 

$E(H; \rho) = (3/2)\beta^{-1}.$ 
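
The Gaussian integral can be checked numerically. The following is only a sketch: it assumes the Maxwell weight exp[-(β/2)p²] for each momentum component, uses a plain midpoint rule, and the function names and the choice of integration window are illustrative rather than taken from the text.

```python
import math

def gaussian_average_kinetic_energy(beta, half_width=12.0, steps=20000):
    """Average of (1/2) p^2 for one momentum component under the weight
    exp(-beta p^2 / 2), approximated with the midpoint rule."""
    sigma = 1.0 / math.sqrt(beta)            # standard deviation of each p_i
    lo = -half_width * sigma                 # the tails beyond 12 sigma are negligible
    h = 2.0 * half_width * sigma / steps
    numerator = 0.0                          # integral of (1/2) p^2 exp(-beta p^2 / 2)
    normaliser = 0.0                         # integral of exp(-beta p^2 / 2)
    for n in range(steps):
        p = lo + (n + 0.5) * h
        w = math.exp(-0.5 * beta * p * p)
        numerator += 0.5 * p * p * w * h
        normaliser += w * h
    return numerator / normaliser

def expected_energy(beta):
    # The three components contribute equally, and the volume V cancels
    # between the factor V in (22.31) and the normalising constant.
    return 3.0 * gaussian_average_kinetic_energy(beta)
```

For example, `expected_energy(2.0)` should agree with (3/2)β⁻¹ = 0.75 to within the accuracy of the quadrature.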

How are we supposed to think about all of this? From the point of view of 
mathematics, formulas (22.29)-(22.31) represent cut and dried integrations. They are 
standard procedures in the mathematical theory of probability. From the viewpoint 
of the kinetic theory of gases we might think of an enormous number of gas molecules 
confined to a box, but otherwise not interacting with one another. We then might 
think of $\rho(q_1, q_2, q_3, p_1, p_2, p_3)$ as measuring the relative frequency of gas molecules in 
a small region centered about the point $(q_1, q_2, q_3, p_1, p_2, p_3)$ in phase space. With this 



interpretation the averaging process going into the definition of $E(H; \rho)$ is an 
$E(H; \rho)$ as the average energy per gas molecule. 
interpretation. We might be studying a system consisting of a single gas molecule in a 




the probability of the energy taking values in any interval, or of the particle having 
Of course, if we want to probe any deeper we must investigate the general meaning of 
the word ‘probability’. Unfortunately, although the mathematical foundations of 
the theory of probability were clarified over fifty years ago, the same cannot be said 
for the philosophical interpretation. Writing today, in 1987, there is still a raging 
the frequentists who restrict the range of application of probability to situations in 




providing rules of thought - the theory of probability expressing how every right 
members of the other as being ‘muddle-headed’ etc. This philosophical argument 
has real consequences in actual statistical practice. The different views lead to 
different statistical procedures, especially when dealing with small samples. 
Fortunately, as far as statistical mechanics is concerned, because the samples 
involved are extremely large, these debates are irrelevant. All that matters is the 
mathematical aspect of the subject, that is, computations with the definitions we 
have given. However, Gibbs, writing a hundred years ago, before the mathematical 


He seems to have leaned toward the frequentist position. Therefore he would talk of 








nology and the word ‘ensemble’ as a synonym for probability measure have gotten 
stuck in the standard texts. 

Whatever the interpretation, we are now prepared to make the first link up 
between the notions of statistical mechanics and thermodynamics. If the observable 
we are considering is regarded as the energy, H, then 

the internal energy of the state $\rho$ is defined to be $E(H; \rho)$. 

In other words, the internal energy is the expected value of the energy in the given 
state. Of course the energy function H is a fixed function on M. There are various 

definition as 

$U(\rho) = E(H; \rho)$    (22.32) 

so the internal energy is a function on the space of states. 






(assumed to be 0 inside the box and essentially ‘$+\infty$’ outside the box) depends on 

Figure 22.15 Dependence of $\mathcal{V}$ on parameters ($\mathcal{V}$ = ‘$+\infty$’ outside the box). 

Thus U depends on these parameters as well, 

$U = U(\rho, Q_1, Q_2, \ldots, Q_d).$ 

states? Once we have given the definition of entropy, we will use the maximum 
entropy principle to define and determine the equilibrium states. We will do this in 
general in section 22.11. But let us give a preview here for our special case of gas in a 
box. 

For each positive value of $\beta$ consider the function $e^{-\beta H}$ on $\mathbb{R}^6$. (Here $\beta$ is a 
parameter which has units of inverse energy, so that $\beta H$ is a numerical-valued 

outside the box, the function $e^{-\beta H}$ vanishes when $(q_1, q_2, q_3)$ is outside the box, and is 

$F^{-1} e^{-\beta H}$ 

is chosen to make the total integral 1, that is 

$F = \int e^{-\beta H}\,dq_1 dq_2 dq_3 dp_1 dp_2 dp_3.$    (22.33) 

Here F depends upon $\beta$ and also upon $Q_1, Q_2, \ldots, Q_d$, but only since $\mathcal{V}$ does. The 
dependence of F on the Qs is only through $\mathcal{V}$. The basic definition (in this special 
case) is to assert that the states defined by 

$\rho_{\beta,Q} = F^{-1} e^{-\beta H}$    (22.35) 

where $F = F(\beta, Q_1, Q_2, \ldots, Q_d)$ is given by (22.33) are the equilibrium states. 

Let us define 

$Z = \log F.$    (22.36) 

We may compute $\partial Z/\partial\beta$ by using (22.36) and differentiating under the integral sign 
in (22.33) to obtain 

$\partial Z/\partial\beta = F^{-1}\,\partial F/\partial\beta = -F^{-1} \int H e^{-\beta H}\,dq_1 dq_2 dq_3 dp_1 dp_2 dp_3$ 

$= -\int H \rho_{\beta,Q}\,dq_1 dq_2 dq_3 dp_1 dp_2 dp_3 = -E(H; \rho_{\beta,Q})$    (22.37) 

or 

$\partial Z/\partial\beta = -U.$    (22.38) 
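
The relation between Z and U can be tested on the free gas in a box. The sketch below assumes the closed form F = V(2π/β)^{3/2} for the partition function of that special case, and approximates the β-derivative by a central difference; the function names and the step size are illustrative choices.

```python
import math

def massieu(beta, volume):
    # Z = log F, with F = V (2 pi / beta)^(3/2) assumed for the free gas in a box
    return math.log(volume) + 1.5 * math.log(2.0 * math.pi / beta)

def internal_energy_from_Z(beta, volume, h=1e-6):
    # U = -dZ/dbeta, approximated by a central difference in beta
    return -(massieu(beta + h, volume) - massieu(beta - h, volume)) / (2.0 * h)
```

The value returned should be close to (3/2)β⁻¹ whatever volume is supplied, since the volume only shifts Z by a term independent of β.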


slowly move one of the pistons, to change, say, $Q_1$. The corresponding change in H at 
some definite point of M, the function $\partial H/\partial Q_1$, would be the corresponding 
theoretical-force term. At any given state $\rho$ we must compute the average value to get 
the generalized force 

$E(\partial H/\partial Q_1; \rho).$ 

Reversibility means that we are restricting the $\rho$s to lie on the manifold of 
equilibrium states. Hence the work form is given by 

$\omega = E(\partial H/\partial Q_1; \rho)\,dQ_1 + E(\partial H/\partial Q_2; \rho)\,dQ_2 + \cdots + E(\partial H/\partial Q_k; \rho)\,dQ_k.$ 

Again, computing the derivative of Z with respect to the $Q_i$, we can write this as 

$-\beta\omega = (\partial Z/\partial Q_1)\,dQ_1 + (\partial Z/\partial Q_2)\,dQ_2 + \cdots + (\partial Z/\partial Q_k)\,dQ_k$    (22.39) 

or 

$d_Q Z = -\beta\omega$    (22.39a) 


where $d_Q$ in (22.39a) means the $d$ 
operator in the Q directions (holding $\beta$ fixed). Notice that our derivation of 
equations (22.37)-(22.39a) was quite general. It did not depend on the specific form of 
H. If we now assume that we are dealing with ‘free gas in a box’ where the function F 
and hence Z depend on the Qs only via the volume V, we can write $\partial Z/\partial Q_i = 
(\partial Z/\partial V)(\partial V/\partial Q_i)$ and so 

$-\beta\omega = (\partial Z/\partial V)\,dV.$ 

Of course an increase in V corresponds to a decrease in H (since points that were 
outside the box where $\mathcal{V} = +\infty$ are now inside the box where $\mathcal{V} = 0$). With the usual 


equation becomes 


dz/dv= Pp = v, 

which shows that Z has all the properties we ascribed to the Massieu function. 
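
As a worked check, one may substitute the free-gas partition function into these relations. The display below assumes $F = V(2\pi/\beta)^{3/2}$, as found for the Maxwell distribution in the box, and reads the pressure $p$ off the relation $\partial Z/\partial V = \beta p$; it is a sketch for this special case only.

```latex
\begin{align*}
Z &= \log F = \log V + \tfrac{3}{2}\log(2\pi/\beta), \\
\partial Z/\partial\beta &= -\tfrac{3}{2}\,\beta^{-1} = -U, \\
\partial Z/\partial V &= 1/V = \beta p
  \quad\Longrightarrow\quad pV = \beta^{-1}.
\end{align*}
```

If $\beta^{-1}$ is identified with the temperature, the last line has the form of the ideal gas law for a single molecule.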

In short we now have a rather general prescription for computing the Massieu 
function. Given a system $(M, \mathcal{A}, \mu)$ and a function H on M interpreted as the ‘energy’, 
define the partition function F by 

$F = \int_M e^{-\beta H} \mu.$    (22.40) 

The function H, and hence F, may depend on some auxiliary variables $Q_1, \ldots, Q_k$. 
Also the integral in (22.40) will converge only for some range in $(\beta, Q_1, \ldots, Q_k)$ space. 
Define the internal energy by (22.32), the equilibrium states by (22.35) (with F given 
by (22.40)) and the Massieu function by (22.36). Then (22.38) holds and the work 
form $\omega$ is given by (22.39a). This is the prescription for passing from the ‘microscopic 
model’ coded into the function H to the ‘macroscopic equilibrium phenomena’ 
generated by the Massieu function. It is an explicit prescription, but we still must 
justify it. This will come once we understand the probabilistic definition of entropy, a 
subject whose study we shall begin in the next section. 

There is one slight generalization of what we have been doing which will prove 
convenient. Suppose that instead of considering a single observable, we want to 
consider several, $(J_1, \ldots, J_n)$, at once. We can regard $J = (J_1, \ldots, J_n)$ as a single 
vector-valued observable with values in $\mathbb{R}^n$. Thus we want to allow W-valued 
observables where W is a vector space. A W-valued observable is just a function from 

as in 

the $\mathbb{R}$-valued case). The notion of an integral of a vector-valued function makes good 
sense so the definition (22.28) of the expectation of a vector-valued observable with 
respect to a state carries through without change. Similarly we can talk of the 
probability (in a given state) for J to lie in some (nice) subset of W. We need not 
elaborate on these points. 


22.9. Entropy 

We now introduce the notion of the entropy of a state. Let $(M, \mathcal{A}, \mu)$ be a system. We 
wish to assign a number, Ent$(\rho)$, to each state, $\rho$. The number Ent$(\rho)$ is to be a 
measure of the ‘disorder’ of $\rho$ relative to $\mu$. Thus Ent is to be a function on the space of 
all states. In order to motivate our definition, we first examine the case of a finite 
sample space, case (A) of section 22.6. If $M = \{e_1, \ldots, e_k\}$ then a state $\rho$ is specified by 
giving the k real numbers $p_i = \rho(e_i)$, $i = 1, \ldots, k$. Thus $p_i$ is the probability of the event 
$\{e_i\}$. In information theory, the entropy of the state $p = (p_1, \ldots, p_k)$ is defined as 

$\mathrm{Ent}_k(p) = -\sum_{i=1}^{k} p_i \log p_i.$    (22.41) 

This function has various properties which make it very attractive as a measure of 
‘disorder’ or lack of information. We list some of them: 

(i) $\mathrm{Ent}_k(p_1, \ldots, p_k)$ is symmetric in the $(p_1, \ldots, p_k)$. 

This corresponds to the requirement that our measure of disorder not depend on the 
way we label the outcomes. 

(ii) $\mathrm{Ent}_k(1, 0, \ldots, 0) = 0.$ 

(Here we define $x \log x = 0$ for $x = 0$ by continuity.) 

This corresponds to the assertion that a state in which we are sure of the outcome 
has zero disorder. 

(iii) $\mathrm{Ent}_{k+1}(p_1, \ldots, p_k, 0) = \mathrm{Ent}_k(p_1, \ldots, p_k).$ 

This corresponds to the idea that if we replace our system $M = \{e_1, \ldots, e_k\}$ with the 
system $M' = \{e_1, \ldots, e_k, e_{k+1}\}$ but only consider states in which $e_{k+1}$ is impossible, 
then the entropy is unchanged. In other words, throwing in a fake alternative does 
not change the lack of information. 


(iv) $\mathrm{Ent}_k(p_1, \ldots, p_k) \le \mathrm{Ent}_k(1/k, \ldots, 1/k)$ 

with strict inequality if $p_i \ne 1/k$ for some i. 

This says that the lack of information is a maximum when all alternatives are 
equally likely. For the proof of this assertion, observe that for all $x \ge 0$ we have 

$x \log x \ge x - 1$ 

with equality holding only at x = 1, as can be verified by comparing the derivatives 
of both sides. For $x > 0$ we can write this as 

$-\log x \le x^{-1} - 1.$ 

Thus, taking $x = k p_i$, 

$-p_i \log p_i - (-p_i \log(1/k)) = -p_i \log(k p_i) \le p_i \left( \frac{1}{k p_i} - 1 \right) = \frac{1}{k} - p_i.$ 

Summing over i the right side vanishes since $\sum (1/k) = 1 = \sum p_i$. Thus 

$-\sum p_i \log p_i \le -\sum p_i \log(1/k) = -\log(1/k) = -\sum (1/k) \log(1/k).$ 

Notice that we can think of $\mathrm{Ent}(p_1, \ldots, p_k)$ as representing our ignorance before 
performing the experiment which tells us which of the k alternatives actually 
occurred. In this sense, $\mathrm{Ent}(p_1, \ldots, p_k)$ represents the amount of information obtained 
by performing the experiment. 

(v) Suppose that $M = M_1 \times M_2$ where $M_1$ consists of k elements $\{e_1, \ldots, e_k\}$ and 
$M_2$ consists of l elements $\{f_1, \ldots, f_l\}$, so that M consists of the kl elements $\{(e_i, f_j)\}$. Suppose that 
$\rho = \rho_1 \times \rho_2$ so that $\rho(\{(e_i, f_j)\}) = p_i q_j$. Then 

$\mathrm{Ent}_{kl}(\rho) = -\sum p_i q_j \log(p_i q_j)$ 

$= -\sum p_i q_j (\log p_i + \log q_j)$ 

$= -\sum_{i,j} q_j\, p_i \log p_i - \sum_{i,j} p_i\, q_j \log q_j.$ 

So since $\sum p_i = \sum q_j = 1$, 

$\mathrm{Ent}_{kl}(\rho_1 \times \rho_2) = \mathrm{Ent}_k(\rho_1) + \mathrm{Ent}_l(\rho_2).$ 

In other words, if we conduct two independent experiments the total amount of 
information gained is the sum of the information gained in each. 
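
Properties (i)-(v) are easy to test on small examples. The following sketch implements the finite entropy of (22.41) with the convention x log x = 0 at x = 0; the helper `product_state` and the function names are illustrative and not part of the text.

```python
import math

def entropy(probs):
    # Ent_k(p) = -sum_i p_i log p_i, with x log x taken to be 0 at x = 0
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def product_state(p, q):
    # probabilities p_i q_j of the product state rho_1 x rho_2 on M_1 x M_2
    return [pi * qj for pi in p for qj in q]
```

For instance, with p = (0.5, 0.3, 0.2) and q = (0.6, 0.4), `entropy(product_state(p, q))` should equal `entropy(p) + entropy(q)` up to rounding, and `entropy(p)` should be smaller than log 3, the entropy of the uniform state on three points.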

It turns out that properties (i)-(iv), together with a slightly stronger version of (v), 




space, there is essentially only one way of measuring the ‘disorder’ of a state if this 




system. 

For any system $(M, \mathcal{A}, \mu)$ and any state $\rho$ we define 

$\mathrm{Ent}(\rho) = -\int_M \rho \log \rho\, \mu$    (22.42) 

provided that the integral converges. 


22.10. Equilibrium in statistical systems 

We now have a measure of ‘disorder’ for any state. Now suppose we are given a 
system and a sequence $J_1, \ldots, J_n$ of real-valued observables. We can observe the 
expected values of these observables in any state. We collect the observables into a 
vector-valued observable $J = (J_1, \ldots, J_n)$. Using the maximum entropy principle as 
motivation, we consider a state of ‘statistical equilibrium’ to be a state which 
maximizes ‘disorder’ subject to our knowledge of the expected value of J. We can 
now pose a precise mathematical problem. 





hypotheses which will be discussed below, this problem has a unique solution which 
we now proceed to describe. 

Let $V^*$ denote the dual space to V. Thus an element $\gamma \in V^*$ is a linear functional on 
V. We denote the value of $\gamma$ at the vector v by $\gamma \cdot v$. For any $\gamma \in V^*$ and $m \in M$ we can 
form $\gamma \cdot J(m)$ which is a number depending on $\gamma$ and m. Thus $\gamma \cdot J$ is a numerical 
function on M. We can thus form the integral 

$F(\gamma) = \int_M e^{-\gamma \cdot J} \mu.$ 

Thus $0 < F(\gamma) \le +\infty$ (the integral may diverge to $+\infty$). Suppose that 
$F(\gamma)$ is finite. Then 

$\rho_\gamma = \frac{e^{-\gamma \cdot J}}{F(\gamma)}$    (22.43) 

defines a state of the system. Indeed $\rho_\gamma$ is a positive function and by the definition of 
$F(\gamma)$ we have $\int \rho_\gamma \mu = 1$. For each value of $\gamma$ where $F(\gamma) < +\infty$ we get a state $\rho_\gamma$. 
We thus have a collection of states, parametrized by a subset of $V^*$. Notice that 
if $F(\gamma) < +\infty$ then Ent$(\rho_\gamma)$ is finite. Indeed 

$\int \rho_\gamma \log \rho_\gamma\, \mu = \int \rho_\gamma (-\log F(\gamma) - \gamma \cdot J)\, \mu.$ 

Now $F(\gamma)$ is a constant, when considered as a function on M. Therefore 

$\int \rho_\gamma \log F(\gamma)\, \mu = \log F(\gamma).$ 

Furthermore, 

$\int J \rho_\gamma \mu = E(J; \rho_\gamma)$ 

is just the expectation of the vector-valued observable J in the state $\rho_\gamma$. Therefore we 
can write 

$\int \rho_\gamma\, \gamma \cdot J\, \mu = \gamma \cdot E(J; \rho_\gamma).$ 

Thus 

$\mathrm{Ent}(\rho_\gamma) = \log F(\gamma) + \gamma \cdot E(J; \rho_\gamma).$    (22.44) 

Let us take into account that the function J may also depend on some 
‘configurational’ parameters, Q. If we write $S(\gamma, Q) = \mathrm{Ent}(\rho_{\gamma,Q})$ and $Z(\gamma, Q) = 
\log F(\gamma, Q)$ then (22.44) becomes 

$S = Z + \gamma \cdot \bar{J}(\gamma, Q).$ 

In the special case that J = H this is exactly (22.24). But (22.24) is the combined 
first and second law of thermodynamics and we derived. 

Suppose that for a given value of $\bar{J} \in V$ we can find a $\gamma \in V^*$ such that 

$\int J \rho_\gamma \mu = E(J; \rho_\gamma) = \bar{J}.$ 

(In particular we are assuming that $\int J \rho_\gamma \mu$ converges absolutely.) 
We claim that the state $\rho_\gamma$ is then the unique solution to the maximization problem. 
In other words, we have the following theorem. 




Notice also that inequality (20.46) is strict if $\rho(m)/\rho_\gamma(m) \ne 1$. Thus if $\rho(m) \ne \rho_\gamma(m)$ for 
a set of positive measure we must have $\mathrm{Ent}(\rho) < \mathrm{Ent}(\rho_\gamma)$. This completes the proof of 
the theorem. 

As a consequence of the above theorem we now give the (statistical) definition of 
an equilibrium state. 

Definition: Let J be an observable on $(M, \mathcal{A}, \mu)$. If $\gamma$ is such that $F(\gamma) = \int e^{-\gamma \cdot J} \mu < \infty$ 
then 

$\rho_\gamma = \frac{1}{F(\gamma)}\, e^{-\gamma \cdot J}$ 

is called an equilibrium state of the system (relative to the observable J). 

We must show that (under suitable hypotheses) for any value of $\bar{J}$ there exists a 
state $\rho_\gamma$ such that $E(J; \rho_\gamma) = \bar{J}$. The proof of our theorem shows that if such a $\rho_\gamma$ exists 
it must be unique. We defer this question to a later section. We first give various 
examples of the notion of equilibrium state, using the systems introduced in the 
examples of section 22.6. 

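A1. Let $M = \{e_1, \ldots, e_k\}$ and $\mu$ be as in example A of section 22.6. Let $V = \mathbb{R}$ be one- 
dimensional and let $J: M \to \mathbb{R}$ be given by $J(e_i) = \varepsilon_i$ where the $\varepsilon_i$ are real numbers 
(which we may think of as ‘energy levels’). With no loss of generality we may assume 
the labelling of the elements of M is such that 

$\varepsilon_1 \le \varepsilon_2 \le \cdots \le \varepsilon_k.$ 

Then $V^* = \mathbb{R}$ and for any $\beta \in V^*$ we have 

$F(\beta) = \sum_i e^{-\beta \varepsilon_i}, \qquad \rho_\beta(e_i) = \frac{e^{-\beta \varepsilon_i}}{F(\beta)}, \qquad E(J; \rho_\beta) = \frac{\sum_i \varepsilon_i e^{-\beta \varepsilon_i}}{\sum_i e^{-\beta \varepsilon_i}} = -\frac{\partial \log F}{\partial \beta}.$ 

(This last equation represents a general rule, as we have seen.) Then 

$\frac{\partial E(J; \rho_\beta)}{\partial \beta} = \frac{\partial}{\partial \beta} \left( \frac{\sum_i \varepsilon_i e^{-\beta \varepsilon_i}}{\sum_i e^{-\beta \varepsilon_i}} \right) = -\int J^2 \rho_\beta \mu + \left( \int J \rho_\beta \mu \right)^2$ 

$= -\int (J - \bar{J})^2 \rho_\beta \mu \le 0$ 

with strict inequality unless all the $\varepsilon_i$ are equal. (If all the $\varepsilon_i$ are equal there is only one 
possible value for $\bar{J}$, namely the common value of all the $\varepsilon_i$, and all the $\rho_\beta$ are the 

then $E(J; \rho_\beta)$ is a strictly decreasing function of $\beta$. It is clear that 

$\lim_{\beta \to -\infty} E(J; \rho_\beta) = \varepsilon_k, \qquad \lim_{\beta \to +\infty} E(J; \rho_\beta) = \varepsilon_1.$ 

Thus, any value of $\bar{J}$ such that $\varepsilon_1 < \bar{J} < \varepsilon_k$ can be achieved by a unique choice of $\beta$. 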
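
Because the expected value is strictly decreasing in β, the value of β that produces a prescribed mean can be found by bisection. The sketch below assumes a finite list of energy levels of moderate size (so that the exponentials do not overflow on the search bracket) and a target strictly between the smallest and largest level; the function names and the bracket [-50, 50] are illustrative assumptions.

```python
import math

def partition_function(beta, levels):
    # F(beta) = sum_i exp(-beta * eps_i)
    return sum(math.exp(-beta * e) for e in levels)

def equilibrium_state(beta, levels):
    # rho_beta(e_i) = exp(-beta * eps_i) / F(beta)
    F = partition_function(beta, levels)
    return [math.exp(-beta * e) / F for e in levels]

def mean_energy(beta, levels):
    # E(J; rho_beta) = sum_i eps_i * rho_beta(e_i)
    return sum(e * p for e, p in zip(levels, equilibrium_state(beta, levels)))

def solve_beta(target, levels, lo=-50.0, hi=50.0, tol=1e-12):
    # Bisection: mean_energy decreases as beta increases, so a mean that is
    # still too large means beta must be raised.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid, levels) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def entropy(probs):
    # finite entropy, with x log x taken to be 0 at x = 0
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

With levels (0, 1, 3) and target mean 1, the β returned by `solve_beta` is positive (the uniform state at β = 0 has mean 4/3), and the entropy of the resulting state should equal log F(β) + β times the mean, as in (22.44).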


$\mathrm{Ent}(\rho_\beta) = -\sum_i \rho_\beta(e_i) \log \rho_\beta(e_i)$ 

$= -\sum_i \frac{e^{-\beta \varepsilon_i}}{F(\beta)} \left( \log(e^{-\beta \varepsilon_i}) - \log F(\beta) \right)$ 

$= \beta \bar{J} + \log F(\beta).$ 

This is of course a special case of (22.24). 

convenience, we label the elements as $e_0, \ldots, e_k$ so that there are k + 1 elements in M. 






In this case $\gamma = (\gamma_1, \ldots, \gamma_k)$ and 


clearly converges for all values of y with 


^ 'FCti * F(yY 

If we denote $\rho_\gamma(e_i)$ by $p_i$ then it is clear that any point $p = (p_1, \ldots, p_k)$ in the interior of 
the simplex can be achieved by a suitable choice of the $\gamma_i$. In this case all states 
corresponding to interior points p are equilibrium states. 
The reader can check that this map is one-to-one with a differentiable inverse. 

B. Let $(\mathbb{R}, \mathcal{A}, \mu)$ be the real line with its standard measure. If we take J to be the 
identity map $J(x) = x$ then the corresponding $F(\gamma) = \int e^{-\gamma x}\,dx$ diverges for all 
values of $\gamma$. However, let us consider the map $J: \mathbb{R} \to \mathbb{R}^2$ given by $J(x) = (x, x^2)$. Then 
for all $\gamma = (\gamma_1, \gamma_2)$ with $\gamma_2 > 0$ the function 

$F(\gamma) = \int_{-\infty}^{\infty} e^{-\gamma_1 x - \gamma_2 x^2}\,dx$ 

converges. The corresponding equilibrium state 

$\rho_\gamma = F(\gamma)^{-1} e^{-\gamma_1 x - \gamma_2 x^2} = F(\gamma)^{-1} e^{\gamma_1^2/4\gamma_2}\, e^{-\gamma_2 (x + \gamma_1/2\gamma_2)^2}$ 

is clearly a normal density with expectation $m = -\gamma_1/2\gamma_2$ and variance $\sigma^2 = (2\gamma_2)^{-1}$. 
It is clear that for any $\sigma > 0$ and any m we can find $\gamma = (\gamma_1, \gamma_2)$ such that 

$\rho_\gamma(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - m)^2/2\sigma^2}.$ 

Thus among all random variables with densities and having a given expectation 
and variance, the normal distributions maximize the entropy. To compute the 
entropy we observe that by translational invariance we may as well assume that 
m = 0. Then 

$\mathrm{Ent}(\rho_\gamma) = \int \rho_\gamma(x) \left[ \tfrac12 \log 2\pi + \log \sigma + \frac{x^2}{2\sigma^2} \right] dx$ 

$= \log \sigma + \tfrac12 \log 2\pi + \tfrac12.$ 

Notice that the entropy tends to $-\infty$ as $\sigma \to 0$. This corresponds to the fact that as $\sigma \to 0$ 
the $\rho_\gamma$ is more and more concentrated about a point and an ‘infinite amount of 
information is required to pick a point out of a continuum’. 
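
The entropy of a normal density can be compared with a direct numerical integration of -ρ log ρ. The sketch below assumes mean zero and uses the closed form log σ + ½ log 2π + ½; the midpoint rule, the integration window of twelve standard deviations, and the function names are illustrative choices.

```python
import math

def normal_density(x, sigma):
    # normal density with mean 0 and standard deviation sigma
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def numerical_entropy(sigma, half_width=12.0, steps=20000):
    # midpoint-rule approximation of -integral rho log rho dx
    lo = -half_width * sigma
    h = 2.0 * half_width * sigma / steps
    total = 0.0
    for n in range(steps):
        x = lo + (n + 0.5) * h
        rho = normal_density(x, sigma)
        if rho > 0.0:                      # x log x is taken to be 0 at x = 0
            total -= rho * math.log(rho) * h
    return total

def closed_form_entropy(sigma):
    # log sigma + (1/2) log(2 pi) + 1/2
    return math.log(sigma) + 0.5 * math.log(2.0 * math.pi) + 0.5
```

Evaluating `closed_form_entropy` at small σ, for example σ = 0.01, gives a negative number, which illustrates the drift toward -∞ as the density concentrates.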

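C. Let us consider the observables $J_1 = x$ and $J_2 = \log x$ on $\mathbb{R}^+$. Then 
$\gamma = (\gamma_1, \gamma_2)$, $J = (J_1, J_2)$ and 

$F(\gamma) = \int_0^\infty e^{-(\gamma_1 x + \gamma_2 \log x)}\,dx = \int_0^\infty e^{-\gamma_1 x} x^{-\gamma_2}\,dx$ 

converges for $\gamma_1 > 0$ and $\gamma_2 < 1$. 

If we set $k = -\gamma_2 + 1$ and $y = \gamma_1 x$ the integral becomes 

$\int_0^\infty e^{-\gamma_1 x} x^{k-1}\,dx = \gamma_1^{-k} \int_0^\infty e^{-y} y^{k-1}\,dy = \gamma_1^{-k}\, \Gamma(k).$ 

The corresponding distributions are called the gamma distributions. The density 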



We want to compute the vector valued expectation 



The evaluation of E(J 2 ) is left as an exercise. 
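
The closed form for the partition function of example C can be compared with a numerical integration. The sketch below assumes k ≥ 1 (so that the integrand is bounded at 0), truncates the integral where e^{-γ₁x} is negligible, and uses the general rule that the expectation is minus the derivative of log F to predict E(J₁) = k/γ₁; that prediction and all function names are illustrative, not quoted from the text.

```python
import math

def gamma_partition_numeric(g1, k, upper=60.0, steps=100000):
    # midpoint rule for integral_0^(upper/g1) exp(-g1 x) x^(k-1) dx, with k >= 1
    h = upper / (g1 * steps)
    return sum(math.exp(-g1 * x) * x ** (k - 1.0) * h
               for x in ((n + 0.5) * h for n in range(steps)))

def gamma_partition_closed(g1, k):
    # g1^(-k) * Gamma(k)
    return g1 ** (-k) * math.gamma(k)

def mean_of_x_numeric(g1, k, upper=60.0, steps=100000):
    # E(J_1) = integral of x * exp(-g1 x) x^(k-1) dx divided by F
    h = upper / (g1 * steps)
    num = sum(math.exp(-g1 * x) * x ** k * h
              for x in ((n + 0.5) * h for n in range(steps)))
    return num / gamma_partition_numeric(g1, k, upper, steps)
```

With γ₁ = 1.5 and k = 2.5, the two partition functions should agree closely and the numerical mean should be near k/γ₁ = 5/3.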



e -¥k 


converges for all values of /?. We set 







a system is a measure space $(M, \mathcal{A}, \mu)$, 

a state is a non-negative function $\rho$ on M such that $\int \rho \mu = 1$, 

the entropy of a state is $\mathrm{Ent}(\rho) = -\int \rho \log \rho\, \mu$, 

an observable is a vector-valued function on M, 

the expectation of an observable K in a state $\rho$ is $E(K; \rho) = \int K \rho \mu$. 

If J is a particular V-valued observable the partition function associated to J is the 
function defined for $\gamma \in V^*$ by 

$F(\gamma) = \int_M e^{-\gamma \cdot J} \mu.$ 

This is defined on the subset C of $V^*$ where this integral converges. 

The Massieu function Z is defined on C by $Z = \log F$. The equilibrium states 
relative to J are the states of the form 

$\rho_\gamma = F(\gamma)^{-1} e^{-\gamma \cdot J}$ 

for $\gamma \in C$. Whenever $\gamma$ lies in the interior of C we can compute the expectation of J in 
the equilibrium state $\rho_\gamma$ from the Massieu function by the formula 

$\bar{J}_i = -\partial Z/\partial \gamma_i$ 

which we may write as 

$\bar{J} = -\partial Z/\partial \gamma$ 

for short. Then 

$S(\gamma) = \mathrm{Ent}(\rho_\gamma) = Z + \gamma \cdot \bar{J}.$ 

In particular, we can determine $\gamma$ from $\bar{J}$ by the formula 

$\gamma = \partial S/\partial \bar{J}.$ 

(Here we are thinking of S as a function of $\bar{J}$ by considering $\gamma$ as a function of $\bar{J}$ by 
inverting the given function $\bar{J} = \bar{J}(\gamma)$.) The observable J may depend on some 
auxiliary variables, $Q_1, \ldots, Q_d$. That is, J may be a function on $M \times B$ where B is some 
space with coordinates $Q_1, \ldots, Q_d$. Then the function F and the equilibrium states $\rho_\gamma$ 
will also depend on these auxiliary parameters. 

If one of the components of J is called H - the ‘energy’ - then the internal energy 
of a state $\rho$ is defined to be $E(H; \rho)$. With these definitions, and the discussion of work 
that we gave in section 22.8 (cf. equation (22.40)) we now have all the ingredients to 
pass from the ‘microscopic’ model to the ‘macroscopic’ observed phenomena. 


22.11. Quantum and classical gases 

Let us begin by comparing the partition functions for the three systems, D, E, and F 
in section 22.8. For each of these systems we shall consider a two-dimensional 
observable $J = (N, H)$ as before, where N is to be thought of as the ‘number of 
occupancy’ and H as the ‘energy level’, and where the functions N and H are related by 

$H = N\varepsilon$ 









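Table 22.2 

System   $\mu(k)$                                          Partition function                                                              $E(N)$ 

D        $\mu(k) = 1/k!$                                   $\exp[e^{-(\beta_1 + \beta_2\varepsilon)}] = \exp[e^{(\mu - \varepsilon)/T}]$     $e^{(\mu - \varepsilon)/T}$ 

E        $\mu(0) = \mu(1) = 1$, $\mu(k) = 0$ if $k > 1$    $1 + e^{-(\beta_1 + \beta_2\varepsilon)} = 1 + e^{(\mu - \varepsilon)/T}$         $\dfrac{1}{1 + e^{-(\mu - \varepsilon)/T}}$ 

F        $\mu(k) = 1$ for all $k$                          $\dfrac{1}{1 - e^{-(\beta_1 + \beta_2\varepsilon)}} = \dfrac{1}{1 - e^{(\mu - \varepsilon)/T}}$     $\dfrac{1}{e^{-(\mu - \varepsilon)/T} - 1}$ 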

where $\varepsilon$ is thought of as the ‘energy of the single state occupancy’. We have already 
done the computation for case D in the preceding section, but let us now do them all 
at once in the form of a table: Table 22.2. 

for a ‘dilute system’ the Boltzmann, Fermi-Dirac, and Bose-Einstein probability 
assignments will be quite close. That is, the terms in the right hand column of 

‘concentration’ they can differ markedly. Figure 22.17 is a graph of the three 
functions $e^{-t}$, $1/(e^t + 1)$ and $1/(e^t - 1)$. Notice that for t > 2 the curves are practically 
identical, but for small values of t (corresponding to high concentrations) they 
diverge from one another considerably. 
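
The behaviour of the three curves can be tabulated directly. The sketch below evaluates the three functions of t named above; the function names are illustrative.

```python
import math

def boltzmann(t):
    # e^{-t}
    return math.exp(-t)

def fermi_dirac(t):
    # 1 / (e^t + 1)
    return 1.0 / (math.exp(t) + 1.0)

def bose_einstein(t):
    # 1 / (e^t - 1), defined for t > 0
    return 1.0 / (math.exp(t) - 1.0)

def relative_spread(t):
    # gap between the Bose-Einstein and Fermi-Dirac values, relative to Boltzmann
    return (bose_einstein(t) - fermi_dirac(t)) / boltzmann(t)
```

For every t > 0 the Boltzmann value lies between the other two, and `relative_spread(t)` shrinks as t grows, which is one way to quantify how quickly the three assignments approach one another in the dilute regime.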

Let us now consider the situation where we have d copies of one of these systems, 
each with a different value of the single occupancy energy, so that we have d values 


$M = M_1 \times \cdots \times M_d.$ 


System Name 

D Boltzmann-Poisson 


Prob (N = k) 
k /k\ 













We assume 


and define 



Figure 22.16 Plot of the Boltzmann, Fermi-Dirac and Bose-Einstein 
expectation values for the number of particles as a function of $(\mu - \varepsilon)/T$. 

$H = H_1 + \cdots + H_d.$ 

Then the partition function for the total system is just the product of the partition 
functions for each component system: 

$F = F_1 \times \cdots \times F_d$ 

$F_i = F_{\varepsilon_i}(\beta_1, \beta_2)$ 

where we have to plug in $\varepsilon_i$ for $\varepsilon$ in our expression for the partition function from 
Table 22.2. 

Let us say once more what this means. We have d systems. We have put them 
equilibrium states for the ith subsystem would be determined by its own two 
parameters, $\beta_{1i}$ and $\beta_{2i}$. The fact that the systems are all in equilibrium means that 
we have a common value of $\beta_1$ and $\beta_2$ for all of them. We have recovered Gibbs’ 
criterion (22.26). 

So far we have been very abstract and not chosen any physical description of our 
component systems. Now consider the following model of ‘gas in a box’ (different 
from the approach we took before). Consider a subregion in phase space of the form 
$B \times \square$ where $B \subset \mathbb{R}^3$ is our box and $\square \subset \mathbb{R}^3$ is a little cube (say) centered about 
some point p in momentum space. To say that a gas molecule lies in our region 
means that it is in the box and that its momentum is close to p. The occupation 
number for this subsystem tells us how many particles lie in the box and have 
momentum close to p. The subsystems ‘interact’ by collisions, that is by interchanging 
momenta. Let us assume that the energy of a single particle in the subregion is its 
kinetic energy. Thus 

$\varepsilon(p) = \tfrac12 \|p\|^2 = \tfrac12 (p_x^2 + p_y^2 + p_z^2)$ 

is the ‘single occupancy’ energy for the subsystem centered at p. Thus the 
Boltzmann-Poisson prescription says that the ‘expected number of particles in our 
subsystem’ is proportional to 

$\exp[(\mu - \tfrac12 \|p\|^2)/T].$ 

This is just the Maxwell-Boltzmann distribution, where the additional parameter $\mu$ 
adjusts for the density of the gas. Of course we have passed over two steps of a 
technical nature - allowing for an infinite number of subsystems to fill up all of phase 
space and letting the size of the phase space regions shrink to zero. Neither of these 
presents great difficulties. 

But we could apply the same idea to the Fermi-Dirac or to the Bose-Einstein 
statistics. Thus for Fermi-Dirac the Maxwell-Boltzmann distribution would be 
replaced by 

$\dfrac{1}{\exp[(\tfrac12 \|p\|^2 - \mu)/T] + 1}$ 

with a similar replacement for the Bose-Einstein case. 


22.12. Determinants and traces 

We will now express the formula 

$F(\beta_1, \beta_2) = F_1(\beta_1, \beta_2) \times F_2(\beta_1, \beta_2) \times \cdots \times F_d(\beta_1, \beta_2)$ 

(for each of our three cases) in a fashion that looks weird, but has profound 
consequences. Let $V = \mathbb{R}^d$ and let X be a diagonal matrix whose ith entry is $\varepsilon_i$. 

Let us first consider the Boltzmann-Poisson case. Then since the product of 
exponentials is the exponential of a sum we have 

$F(\beta_1, \beta_2) = \exp[e^{-(\beta_1 + \beta_2 \varepsilon_1)} + \cdots + e^{-(\beta_1 + \beta_2 \varepsilon_d)}]$ 

$= \exp(\mathrm{tr}\, e^{-(\beta_1 I + \beta_2 X)})$ 

$= \mathrm{Det}(\exp e^{-(\beta_1 I + \beta_2 X)})$ 

since, for any matrix A, we have $\mathrm{Det}(\exp A) = \exp(\mathrm{tr}\, A)$. 

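Next let us look at the Fermi-Dirac case. Then since the determinant of any 
diagonal matrix is just the product of its diagonal entries, 

$F(\beta_1, \beta_2) = \det \begin{pmatrix} F_1(\beta_1, \beta_2) & 0 & \cdots & 0 \\ 0 & F_2(\beta_1, \beta_2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & F_d(\beta_1, \beta_2) \end{pmatrix}.$    (22.48) 

The matrix on the right of (22.48) is the same as the matrix 

$I + e^{-(\beta_1 I + \beta_2 X)}$ 

and so 

$F = \mathrm{Det}(I + e^{-(\beta_1 I + \beta_2 X)}).$ 

Table 22.4 

System           Partition function 

Boltzmann        $\mathrm{Det}(\exp e^{-(\beta_1 I + \beta_2 X)})$ 
Fermi-Dirac      $\mathrm{Det}(I + e^{-(\beta_1 I + \beta_2 X)})$ 
Bose-Einstein    $\mathrm{Det}(I - e^{-(\beta_1 I + \beta_2 X)})^{-1}$ 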

Similarly, in the Bose-Einstein case the corresponding equation is 

$F = \mathrm{Det}[I - e^{-(\beta_1 I + \beta_2 X)}]^{-1}.$ 

In other words, we get Table 22.4. 

If we compare Table 22.2 with Table 22.4 we see that the change consists of 
replacing the scalar $\beta_1 + \beta_2 \varepsilon$ by the matrix $\beta_1 I + \beta_2 X$ and taking the determinant. 

We will now express the process of taking the determinant in a slightly different 
fashion. Before doing so we make a number of preliminary remarks about linear 
algebra. Let V be a vector space and A a linear transformation of V. Then A induces a 
linear transformation $A \otimes A$ of $V \otimes V$ determined by 

$(A \otimes A)(u \otimes v) = Au \otimes Av.$ 

If $A = e^{tY}$ then differentiating the equation 

$e^{tY} \otimes e^{tY}(u \otimes v) = e^{tY}u \otimes e^{tY}v$ 

with respect to t at t = 0 gives 

$(d/dt)[e^{tY} \otimes e^{tY}(u \otimes v)]|_{t=0} = Yu \otimes v + u \otimes Yv.$ 

This shows that $e^{tY} \otimes e^{tY} = \exp tZ$ where Z is the operator 

$Y \otimes I + I \otimes Y.$ 

In other words, 

$e^{tY} \otimes e^{tY} = \exp t(Y \otimes I + I \otimes Y).$ 

Similarly, if we consider the induced action on $V \otimes V \otimes V$ we obtain 

$e^{tY} \otimes e^{tY} \otimes e^{tY} = \exp t(Y \otimes I \otimes I + I \otimes Y \otimes I + I \otimes I \otimes Y)$ 

with a similar expression for $V \otimes V \otimes V \otimes V$ and so on. We shall use the general 
expression for the matrix that occurs in the exponential on the right hand side as 
$D_k(Y)$. Thus 

$D_2(Y) = Y \otimes I + I \otimes Y$ on $V \otimes V$, 

$D_3(Y) = Y \otimes I \otimes I + I \otimes Y \otimes I + I \otimes I \otimes Y$ on $V \otimes V \otimes V$, 

and so on. Thus, for example, setting t = 1, on $V \otimes V \otimes V$ we would get 

$e^Y \otimes e^Y \otimes e^Y = e^{D_3(Y)}.$ 

Taking the trace of the left-hand side of this equation gives $(\mathrm{tr}\, e^Y)^3$, since 
the trace of a tensor product of matrices is the product of their traces. Thus 

$(\mathrm{tr}\, e^Y)^3 = \mathrm{tr}\, e^{D_3(Y)}$ 

so that 

$(\mathrm{tr}\, e^Y)^k = \mathrm{tr}\, e^{D_k(Y)}$ 

for all k. 

We can thus write the partition function for the Boltzmann composite system as 

$\mathrm{Det}(\exp e^{-(\beta_1 I + \beta_2 X)}) = \exp(\mathrm{tr}\, e^{-(\beta_1 I + \beta_2 X)})$ 

$= \sum_k (\mathrm{tr}[e^{-(\beta_1 I + \beta_2 X)}])^k / k!$ 

$= \sum_k (1/k!)\, \mathrm{tr}\{\exp(D_k[-(\beta_1 I + \beta_2 X)])\}$ 

by taking $Y = -(\beta_1 I + \beta_2 X)$. We shall write this formula in even more compact form. 


$T_0(V) = \mathbb{R},$ 

$T_1(V) = V,$ 

$T_2(V) = V \otimes V,$ 

$T_3(V) = V \otimes V \otimes V,$ etc. 

Let us now consider the full tensor algebra 

$T(V) = T_0(V) \oplus T_1(V) \oplus T_2(V) \oplus \cdots$ 

(This is an infinite-dimensional space, but not to worry.) Any matrix on this space 

$\begin{pmatrix} A_{00} & A_{01} & A_{02} & A_{03} & \cdots \\ A_{10} & A_{11} & A_{12} & A_{13} & \cdots \\ A_{20} & A_{21} & A_{22} & A_{23} & \cdots \\ A_{30} & A_{31} & A_{32} & A_{33} & \cdots \\ \vdots & & & & \ddots \end{pmatrix}$ 

where $A_{00}$ is a transformation of $T_0(V)$ (i.e. a scalar), $A_{11}$ is a linear transformation of 
$T_1(V)$, $A_{22}$ is a linear transformation of $T_2(V)$ and so on. Now define 

$D(Y) = \begin{pmatrix} D_0(Y) & 0 & 0 & 0 & \cdots \\ 0 & D_1(Y) & 0 & 0 & \cdots \\ 0 & 0 & D_2(Y) & 0 & \cdots \\ 0 & 0 & 0 & D_3(Y) & \cdots \\ \vdots & & & & \ddots \end{pmatrix}.$ 

Here we have set $D_0(Y) = 0$ and $D_1(Y) = Y$. We then have 

$\mathrm{tr}_\otimes\, e^{D(Y)} = \sum_k (1/k!)\, \mathrm{tr} \exp\{D_k(Y)\}$ 

so 

$F(\beta_1, \beta_2) = \mathrm{tr}_\otimes\, e^{-D(\beta_1 I + \beta_2 X)}.$ 

We now turn to an analogous construction for the Fermi-Dirac situation. Instead of 
T(V), the tensor algebra, let us consider $\Lambda(V)$, the exterior algebra. For example, 
let us look at $\Lambda^2(V) = V \wedge V$, the space of exterior two-vectors over V. This time 
define $D_2(Y)$ as a linear transformation on $\Lambda^2(V)$ by 

$D_2(Y)(u \wedge v) = Yu \wedge v + u \wedge Yv.$ 

It is the same formula as before with $\otimes$ replaced by $\wedge$. As before we have 

$e^Y \wedge e^Y = e^{D_2(Y)}$ on $\Lambda^2(V)$. 

In the same way we define 

$D_3(Y) = Y \wedge I \wedge I + I \wedge Y \wedge I + I \wedge I \wedge Y$ on $\Lambda^3(V)$ 

the single matrix on $\Lambda(V)$ whose diagonal components are the $D_k(Y)$. Of course, this 
is now a finite matrix (if V is a finite-dimensional vector space). 

Suppose that Z is a (diagonalizable) matrix with eigenvalues $z_1, \ldots, z_d$ and 

direct sum operator on $\Lambda(V) = \mathbb{R} \oplus V \oplus (V \wedge V) \oplus \cdots$, 

$Z_\Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ 0 & Z & 0 & 0 & \cdots \\ 0 & 0 & Z \wedge Z & 0 & \cdots \\ 0 & 0 & 0 & Z \wedge Z \wedge Z & \cdots \\ \vdots & & & & \ddots \end{pmatrix}$ 

then 

$\mathrm{tr}\, Z_\Lambda = 1 + \sum_i z_i + \sum_{i<j} z_i z_j + \sum_{i<j<k} z_i z_j z_k + \cdots$ 

$= \prod_i (1 + z_i) = \mathrm{Det}(I + Z).$ 

which is just the direct sum of all the formulas 

$e^Y \wedge e^Y = e^{D_2(Y)}$ on $\Lambda^2(V)$, etc. 

We obtain 

$\mathrm{tr}_\Lambda\, e^{D(Y)} = \det(1 + e^Y)$ 

where we have used a boldface $\mathrm{tr}_\Lambda$ on the left of this equation to emphasize that we 
are computing the trace over $\Lambda(V)$. If we take $Y = -(\beta_1 I + \beta_2 X)$ we get 

$\det(I + e^{-(\beta_1 I + \beta_2 X)}) = \mathrm{tr}_\Lambda\, e^{-D(\beta_1 I + \beta_2 X)}.$ 

Let us summarize the strange mathematical manipulations that we have been 

with ‘energies’ $\varepsilon_1, \ldots, \varepsilon_d$. The corresponding partition function is 

$Z(\beta) = \sum_i e^{-\beta \varepsilon_i}.$ 

If we let X be the diagonal matrix with eigenvalues $\varepsilon_1, \ldots, \varepsilon_d$ then we can write this 

$Z(\beta) = \mathrm{tr}\, e^{-\beta X}$ 

matrix. We now want to consider a more complicated system where each of these 
‘energy levels’ can be occupied by a varying number of ‘occupants’. We now have 
two equilibrium observables - the total number of occupants and the total energy. 
The partition function now depends on two variables and the passage from the 

replace 

V by $\Lambda(V)$, then $Z(\beta_1, \beta_2) = \mathrm{tr}_\Lambda\, e^{-D[\beta_1 I + \beta_2 X]}$ (Fermi-Dirac). 








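Table 22.5 

System               Vector space    Partition function 

Simple               $V$             $\mathrm{tr} \exp(-\beta X)$ 

Boltzmann-Poisson    $T(V)$          $\mathrm{tr}_\otimes\, e^{-D[\beta_1 I + \beta_2 X]} = \mathrm{Det}(\exp e^{-[\beta_1 I + \beta_2 X]})$ 

Fermi-Dirac          $\Lambda(V)$    $\mathrm{tr}_\Lambda\, e^{-D[\beta_1 I + \beta_2 X]} = \mathrm{Det}(I + e^{-[\beta_1 I + \beta_2 X]})$ 

Bose-Einstein        $S(V)$          $\mathrm{tr}_{\mathrm{Sym}}\, e^{-D[\beta_1 I + \beta_2 X]} = \mathrm{Det}(I - e^{-[\beta_1 I + \beta_2 X]})^{-1}$ 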


-Poisson and Fermi-Dirac consists of using 
the exterior algebra, $\Lambda(V)$, instead of the full tensor algebra, T(V). For the Bose- 
Einstein case, we replace the exterior algebra by the symmetric algebra S(V) (the 
algebra of all polynomials on $V^*$). 
case except that now we will have 

$\mathrm{tr}_{\mathrm{Sym}}\, Z = \mathrm{Det}(I - Z)^{-1}.$ 

We summarize our results in Table 22.5. 
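
For a diagonal matrix the three trace identities reduce to statements about symmetric functions of the eigenvalues, which can be checked by brute force. The sketch below assumes Z is diagonal with eigenvalues z_i of absolute value less than 1 (so that the symmetric-algebra series converges), truncates the infinite sums at a fixed degree, and uses illustrative function names.

```python
import math
from itertools import combinations, combinations_with_replacement, product

def trace_exterior(z):
    # trace on the exterior algebra: sum of all products z_i z_j ... with
    # strictly increasing indices (the elementary symmetric functions)
    return sum(math.prod(c) for k in range(len(z) + 1) for c in combinations(z, k))

def trace_symmetric(z, max_degree=40):
    # trace on the symmetric algebra, truncated at max_degree: sum of all
    # products with weakly increasing indices
    return sum(math.prod(c)
               for k in range(max_degree + 1)
               for c in combinations_with_replacement(z, k))

def trace_tensor(z, max_degree=12):
    # weighted trace on the tensor algebra, truncated at max_degree:
    # sum over k of (1/k!) times the sum over all index tuples of length k
    return sum(sum(math.prod(c) for c in product(z, repeat=k)) / math.factorial(k)
               for k in range(max_degree + 1))
```

Taking z_i = exp[-(β₁ + β₂ε_i)], the three functions should reproduce, up to truncation error, the products Π(1 + z_i), Π(1 - z_i)⁻¹ and exp(Σ z_i), that is, the Fermi-Dirac, Bose-Einstein and Boltzmann-Poisson determinants of Table 22.5.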


22.13. Quantum states and quantum logic 


Let us review the construction of the preceding section in the simplest case. We 
consider a system which is a finite set and for which $\mu$ assigns equal weight to all 
points. In other words, system A of section 22.6. ‘Integration’ is just summation. An 
observable is a function on this finite set and a state is a non-negative function which 
sums up to 1. But we decide to write functions as diagonal matrices, so that sums 
become traces. Thus a state $\rho$ is written as 


rm o o tr 


P = 

0 p(2) 0 0 ••• 


r 

0 0 p(3) 0 ••• 



V., 


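and an observable $J$ is written as

$$J = \begin{pmatrix} J(1) & 0 & 0 & \cdots \\ 0 & J(2) & 0 & \cdots \\ 0 & 0 & J(3) & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$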
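so that

$$E(J; \rho) = \operatorname{tr} J\rho. \tag{22.50}$$

The partition function was given as

$$F(\beta) = \operatorname{tr} e^{-\beta X} \tag{22.51}$$

where $X$ is the diagonal matrix with entries $\epsilon_1, \ldots, \epsilon_d$,
and then the equilibrium states are given by

$$\rho_\beta = F(\beta)^{-1} \begin{pmatrix} e^{-\beta\epsilon_1} & & \\ & \ddots & \\ & & e^{-\beta\epsilon_d} \end{pmatrix}$$

or

$$\rho_\beta = F(\beta)^{-1} e^{-\beta X}.$$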

The passage from the simple case to the occupancy case involved replacing the
finite-dimensional vector space $V$ by one of the larger spaces $T(V)$, $\Lambda(V)$ or $S(V)$
depending on what kind of ‘statistics’ we wished to consider. In all of this, the only 
matrices we considered were diagonal matrices, so the use of matrices altogether was 
very artificial. But now let us take this matrix picture seriously. 

Let V be a vector space with a scalar product ( , ). (For important physical reasons we
will want V to be a complex vector space and ( , ) to be a positive definite scalar
product, so that V is a Hilbert space. In actual applications one frequently will also
want V to be infinite-dimensional. But for understanding the key ideas we do not
need to get involved with the technical problems of Hilbert space theory and can
illustrate the concepts in a finite-dimensional setting.) Recall that an operator A on V
is called self-adjoint if

$$(Au, v) = (u, Av) \quad \text{for all } u, v \in V.$$


The arguments of section 4.3 (done in the complex vector space setting) show that
any self-adjoint operator can be diagonalized - that is, that there is an orthonormal basis
$e_1, \ldots, e_d$ of V such that

$$Ae_i = \lambda_i e_i$$

where the $\lambda_i$ are real numbers. (In the infinite-dimensional setting the corresponding
theorem, suitably modified, is known as the spectral theorem.) Let us say that A is
non-negative, or $A \geq 0$, if all the $\lambda_i$ are $\geq 0$. What amounts to the same thing, A is
non-negative if and only if

$$(Au, u) \geq 0 \quad \text{for all } u \in V$$

as can be seen by writing u as a linear combination of the $e_i$. We can now make the
following definitions.

A quantum system is a complex Hilbert space, V.

A quantum (statistical) state is a non-negative self-adjoint operator, $\rho$, such that

$$\operatorname{tr}\rho = 1.$$

The entropy of a state $\rho$ is given by

$$\operatorname{Ent}(\rho) = -\operatorname{tr} \rho\log\rho.$$

A quantum observable is a self-adjoint operator.

The expectation of the observable A in the state $\rho$ is given by

$$E(A; \rho) = \operatorname{tr} A\rho.$$

Let $J_1, \ldots, J_k$ be k commuting observables. That is, assume that

$$J_iJ_j = J_jJ_i \quad \text{for all } i \text{ and } j.$$

Let $\beta = (\beta_1, \ldots, \beta_k)$ where the $\beta_i$ are real numbers, and define $\beta J$ to be the observable

$$\beta J = \beta_1J_1 + \cdots + \beta_kJ_k.$$

Then the corresponding partition function is defined by

$$F(\beta) = \operatorname{tr} e^{-\beta J} \tag{22.52}$$

and the equilibrium states $\rho_\beta$ are defined by

$$\rho_\beta = F(\beta)^{-1}e^{-\beta J}.$$

Thus the expectation of an observable A in the equilibrium state corresponding to $\beta$
is given by

$$E(A; \rho_\beta) = F(\beta)^{-1}\operatorname{tr} Ae^{-\beta J}. \tag{22.53}$$
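For commuting observables that happen to be diagonal in a common basis, (22.52) and (22.53) reduce to ordinary sums, which makes them easy to experiment with. The sketch below assumes exactly that diagonal situation; the function names and the four-state example (two levels, each empty or singly occupied) are hypothetical and chosen only for illustration.

```python
from math import exp, isclose, log


def equilibrium_state(beta, observables):
    """Return (F(beta), diagonal of rho_beta) for commuting diagonal observables.

    beta        -- sequence of k real multipliers
    observables -- sequence of k lists; observables[i][m] is the m-th
                   diagonal entry of J_i
    """
    d = len(observables[0])
    weights = [
        exp(-sum(b * J[m] for b, J in zip(beta, observables)))
        for m in range(d)
    ]
    F = sum(weights)                      # F(beta) = tr exp(-beta J)
    return F, [w / F for w in weights]    # rho_beta = F^{-1} exp(-beta J)


def expectation(A, rho):
    """E(A; rho) = tr(A rho) for diagonal A and rho."""
    return sum(a * r for a, r in zip(A, rho))


def entropy(rho):
    """Ent(rho) = -tr(rho log rho), with 0 log 0 taken as 0."""
    return -sum(r * log(r) for r in rho if r > 0)


# Hypothetical example: levels of energy 1.0 and 2.5, each empty or singly
# occupied, giving four basis states.
number = [0, 1, 1, 2]
energy = [0.0, 1.0, 2.5, 3.5]
beta = (0.4, 0.8)

F, rho = equilibrium_state(beta, [number, energy])
mean_number = expectation(number, rho)
mean_energy = expectation(energy, rho)
S = entropy(rho)
```

Substituting the definition of the equilibrium state into the entropy gives Ent = log F + β₁E(J₁) + β₂E(J₂), which is a convenient consistency check on the values computed above.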

Equations (22.52) and (22.53) represent the basic theoretical content of quantum
statistical mechanics. To quote from page 1 of Feynman’s book Statistical
Mechanics: ‘This fundamental law is the summit of statistical mechanics, and the
entire subject is either the slide-down from this summit, as the principle is applied to
various cases, or the climb-up to where the fundamental law is derived... ’.

In the case when V is infinite-dimensional, the trace of an operator becomes an
infinite series, which may not converge. So not all operators have traces.
Correspondingly, the function $F(\beta)$ will only converge for a given range of $\beta$. We will
leave this question aside and concentrate on the case where V is finite-dimensional
as a model.

The basic difference between the quantum system and the classical system is that
in the classical system an observable was a function (on a finite set) which we chose to
regard as a diagonal matrix while in the quantum system we allow all (self-adjoint)
matrices. This might appear, at first glance, as a technical modification (one that has
been amply verified by experiment over the past sixty years). In fact, it represents the
most profound revolution in the history of science because it modifies the
elementary rules of logic. Let us explain.

We begin with a classical system again. Let us define a ‘yes or no observable’ to be
a function, f, that can take on only the values 0 and 1. This corresponds to any
experiment in which the final answer is given by a certain indicator (say a light or a
click) being on or off. To say that a function f takes only the values 0 and 1 is the
same as to say that

$$f(m)^2 = f(m)$$

for all m, or, more succinctly, that

$$f^2 = f. \tag{22.54}$$

Every such function corresponds to a subset, $B \subset M$, the set where f = 1. So given
any subset B we get a function, $f_B$, where

$$f_B(m) = 1 \text{ if } m \in B \quad \text{and} \quad f_B(m) = 0 \text{ if } m \notin B.$$

If f and g are two ‘yes or no observables’ then, in a classical system,

    fg is again a ‘yes or no observable’. (22.55)

Indeed,

$$f_Cf_D = f_{C \cap D} \tag{22.56}$$

since $f_C(m)f_D(m) = 1$ if and only if both $f_C(m) = 1$ and $f_D(m) = 1$. Thus multiplication
of functions corresponds to intersection of subsets, i.e. to logical conjunction. In
particular, $f_Cf_D = 0$ if and only if C and D are disjoint, i.e., $C \cap D = \emptyset$.

If C and D are disjoint then

$$f_C + f_D = f_{C \cup D} \tag{22.57}$$

since both sides take on the value 1 when the argument belongs either to C or to D
(and no point belongs to both). The distributive law for multiplication

$$f_B(f_C + f_D) = f_Bf_C + f_Bf_D \tag{22.58}$$

is just a translation of the distributive law in set theory

$$B \cap (C \cup D) = (B \cap C) \cup (B \cap D). \tag{22.59}$$

This, in turn, is just a version of the distributive law in logic. If we let B denote the
assertion that $m \in B$ etc., then $\cap$ denotes conjunction: $B \cap C$ is the assertion that both
B and C are true. Similarly $\cup$ denotes the (inclusive) or: $C \cup D$ means that either C or
D is true, or both. Thus the distributive law above just reflects the elementary
logical principle:

To say that both B and (either C or D) are true is the same as saying that
either (B and C are both true) or (B and D are both true).
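The classical dictionary between indicator functions and subsets in (22.56) to (22.59) can be spot-checked on a small finite set. This is only an illustrative sketch: the set `M` and the subsets `B`, `C`, `D` are arbitrary hypothetical choices (with C and D disjoint, as (22.57) requires), and the helper names are not from the text.

```python
M = set(range(8))                            # hypothetical finite system
B, C, D = {0, 1, 2, 5}, {1, 2, 3}, {5, 6}    # C and D are disjoint


def indicator(subset):
    """Return the yes-or-no observable f_subset as a dict m -> 0 or 1."""
    return {m: int(m in subset) for m in M}


def times(f, g):
    """Pointwise product of two functions on M."""
    return {m: f[m] * g[m] for m in M}


def plus(f, g):
    """Pointwise sum of two functions on M."""
    return {m: f[m] + g[m] for m in M}


f_B, f_C, f_D = indicator(B), indicator(C), indicator(D)

# (22.56): multiplication corresponds to intersection.
product_is_intersection = times(f_B, f_C) == indicator(B & C)

# (22.57): for disjoint sets, addition corresponds to union.
sum_is_union = plus(f_C, f_D) == indicator(C | D)

# (22.58) and (22.59): the two forms of the distributive law.
functions_distribute = (
    times(f_B, plus(f_C, f_D)) == plus(times(f_B, f_C), times(f_B, f_D))
)
sets_distribute = (B & (C | D)) == ((B & C) | (B & D))
```

All four flags hold for any choice of subsets with C and D disjoint, which is the classical behaviour that the quantum case is about to violate.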

Let us now examine the corresponding situation in a quantum system. Our
observables are no longer functions but self-adjoint operators. We cannot talk
about the value of an observable at a point, but we can still
call a self-adjoint operator, $\pi$, a ‘yes or no observable’ if it has only 0 or 1 as
eigenvalues. This will clearly happen if and only if

$$\pi^2 = \pi. \tag{22.60}$$

Such a $\pi$ is the orthogonal projection onto the
subspace spanned by the eigenvectors corresponding to the eigenvalue 1. Thus each
yes or no question corresponds to a subspace (instead of a subset) and we can write
$\pi_A$ for orthogonal projection onto the subspace A. Notice that now the zero operator
corresponds to orthogonal projection onto the zero subspace, {0}, and

$$\pi_C\pi_D = 0 \text{ only if } C \cap D = \{0\}, \tag{22.61}$$

so the zero subspace, {0}, plays a role analogous to that of the empty set. Notice
also that if $\pi_C\pi_D = 0$ then, taking adjoints, we have

$$0 = (\pi_C\pi_D)^* = \pi_D^*\pi_C^* = \pi_D\pi_C$$

so $\pi_D\pi_C = 0$ and

$$(\pi_C + \pi_D)^2 = \pi_C^2 + \pi_D^2 = \pi_C + \pi_D$$

since the cross terms vanish. Thus $\pi_C + \pi_D$ is again a yes or no observable. It is clearly
orthogonal projection onto the direct sum, $C \oplus D$, of the orthogonal spaces C and D.
Thus

$$\pi_C + \pi_D = \pi_{C \oplus D} \quad \text{if } \pi_C\pi_D = 0 \tag{22.62}$$

in complete analogy with (22.57) where $\oplus$ replaces $\cup$. What fails drastically is
(22.55). Given two subspaces C and D, it is not true, in general, that $\pi_C\pi_D = \pi_D\pi_C$ and
therefore the operator $\pi_C\pi_D$ is not self-adjoint and hence is not an observable. For
example, if C and D are two non-orthogonal lines in the plane, then $\pi_C\pi_D \neq 0$. But the
image of $\pi_C\pi_D$ is C, while the image of $\pi_D\pi_C$ is D. So $\pi_C\pi_D \neq \pi_D\pi_C$. The problem stems
from the noncommutativity of matrix multiplication. Notice that the analogue of
the distributive law in set theory - the would-be assertion that

$$B \cap (C \oplus D) = (B \cap C) \oplus (B \cap D)$$

- is not true in general.

Indeed, if we take C to be the x-axis and D to be the y-axis in the plane, then $C \oplus D$ is
the whole plane and therefore $B \cap (C \oplus D) = B$ for any line B in the plane. But if B is
any line other than the x or y axis, then $B \cap C = B \cap D = \{0\}$ so $(B \cap C) \oplus (B \cap D)$
$= \{0\} \oplus \{0\} = \{0\}$, and the two sides are not equal.
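The failure of (22.55) and of the distributive law can be seen concretely with 2 × 2 projection matrices. The following sketch assumes lines through the origin in the real plane, parametrized by an angle; the helper names are hypothetical, and the 45° line is just one convenient choice of a line B different from both axes.

```python
from math import cos, sin, pi


def projection(theta):
    """Orthogonal projection onto the line through 0 at angle theta."""
    c, s = cos(theta), sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) <= tol
               for i in range(2) for j in range(2))


def lines_meet_only_at_zero(theta1, theta2, tol=1e-12):
    """Two lines through 0 intersect in {0} exactly when they are distinct."""
    return abs(sin(theta1 - theta2)) > tol


pi_x, pi_y, pi_b = projection(0.0), projection(pi / 2), projection(pi / 4)

# Orthogonal lines: the sum is again a projection (here onto the whole plane).
total = add(pi_x, pi_y)
sum_is_projection = close(matmul(total, total), total)

# Non-orthogonal lines: the two products differ and are not self-adjoint.
xb, bx = matmul(pi_x, pi_b), matmul(pi_b, pi_x)
products_commute = close(xb, bx)
product_self_adjoint = close(xb, transpose(xb))

# Distributive law: B ∩ (C ⊕ D) = B, yet B ∩ C = B ∩ D = {0}.
b_meets_axes_trivially = (lines_meet_only_at_zero(pi / 4, 0.0)
                          and lines_meet_only_at_zero(pi / 4, pi / 2))
```

Here `sum_is_projection` and `b_meets_axes_trivially` come out true while `products_commute` and `product_self_adjoint` come out false, matching the discussion of two non-orthogonal lines.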

Thus the distributive law does not hold in quantum logic. As we mentioned above, 
the validity of quantum mechanics has been experimentally demonstrated over and 
over again during the past sixty years. So experiment has shown that one must 
abandon one of the most cherished principles of logic when dealing with quantum 
observables. 


Summary 


A Caratheodory’s formulation of the Second Law 

You should be able to explain under what circumstances there exist points near a 
point P which cannot be joined to P by null curves of a one-form a. 

You should be able to state and explain Caratheodory’s formulation of the Second
Law.

You should be able to explain the concepts of absolute temperature and entropy
in terms of differential forms and line integrals and to describe how these quantities
may be computed from empirical data.


Exercises 


22.1. Let $\alpha = (y^3 + y)\,dx + (xy^2 + x)\,dy$. Characterize the set of points in $\mathbb{R}^2$ that
can be joined to the point (…) by a null curve.

22.2. Let $\alpha = y\,dx + x\,dz$. Find a null curve of $\alpha$ that joins the origin to the point
(…, b, c), where b > 0 and c > 0. With the exception of the origin, the curve
should not pass through any points where x, y, or z equals zero.

22.3. Let $\alpha = x\,dy$. Show that any two points in $\mathbb{R}^3$ can be joined by a null curve
of $\alpha$, even though $\alpha \wedge d\alpha = 0$. Show that for the form $\beta = (1 + x^2)\,dy$, two
points with different y coordinates cannot be joined by a null curve.

22.4. Write the form $\alpha = 2ye^x\,dx$ as $\alpha = f\,dg$ and as $\alpha = F\,dG$ in such a way that F
is not simply a constant multiple of f.

22.5. Consider one mole of a monatomic ideal gas, for which $pV = RT$ and
$U = \tfrac{3}{2}pV$. Suppose the gas is initially in the state p = 32, V = 1 and expands
to the state p = 1, V = 8. Determine the heat
absorbed by the gas for each of the following processes:

(a) First p is reduced to 1 by cooling the gas at constant volume, then V is
increased to 8 at constant pressure.

(b) First the gas expands isothermally to V = 8, then p is reduced to 1 at
fixed volume.

(c) Throughout the process, $pV^{5/3} = 32$.


22.6. For the monatomic ideal gas of exercise 22.5, write the heat form as
$\alpha = T\,dS$. Thereby determine S (up to an additive constant) as a function
of p and V. Confirm that $dT \wedge dS = dp \wedge dV$.

22.7. For a container of volume V filled only with electromagnetic radiation at
temperature T, the pressure is $p = \tfrac{1}{3}CT^4$, while the energy is $U = 3pV$.
(Here C is constant.)

(a) By expressing the heat form $dU + p\,dV$ as $\alpha = T\,dS$, find an expression
for S as a function of p and V.

(b) Suppose that initially the volume of the container is $V_0$, the
temperature $T_0$. The volume is now increased adiabatically to $64V_0$.
Determine the final temperature and the work performed during the
expansion.

22.8. (a) In a Carnot engine, heat is absorbed at temperature $4T_0$ and
exhausted at temperature $T_0$. What maximum fraction of the heat
absorbed can be delivered as useful work by the engine?

(b) Suppose this same engine were operated backwards as a refrigerator.
If heat Q is absorbed at the lower temperature $T_0$, how much work
must be performed on the refrigerator?

22.9. For a system of N protons, each of magnetic dipole moment $\mu$, placed in an
external magnetic field H, it is reasonable to regard H as the only
configuration variable. The magnetization M of the protons, and the
associated energy U, are well approximated, in the limit of large T, by

$$M = \frac{N\mu^2H}{kT}, \qquad U = -\frac{N(\mu H)^2}{kT}.$$

The work done by the system is given by $W = \int M\,dH$. The heat absorbed is
of course $\Delta U + W$. Find an expression for entropy S as a function of T and
H.


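22.10. Consider a system for which the volume V is the only configurational
variable. Prove the following relationships:

(a) $\left(\dfrac{\partial T}{\partial V}\right)_{\text{adiabatic}} = -\left(\dfrac{\partial p}{\partial S}\right)_{\text{fixed volume}}$

(b) $\left(\dfrac{\partial S}{\partial V}\right)_{\text{isothermal}} = \left(\dfrac{\partial p}{\partial T}\right)_{\text{fixed volume}}$

(Hint: make use of $dT \wedge dS = dp \wedge dV$.)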


22.11. The ‘Helmholtz free energy’ F of a system is defined in terms of the internal
energy U as $F = U - TS$.

(a) Show that if a system interacts isothermally with its surroundings, the
increase in F equals the work done on the system.

(b) Suppose F is expressed as a function of V and T. Determine its partial
derivatives with respect to these two variables.

(c) Determine the function F(V, T) for the ideal monatomic gas of exercise
22.5.


22.12. Let $K = -\dfrac{dT \wedge dV}{V\,dT \wedge dp}$ denote isothermal compressibility
and $\alpha = \dfrac{dV \wedge dp}{V\,dT \wedge dp}$ denote coefficient of thermal expansion at constant
pressure.

Prove the relation $C_p = C_V + VT\alpha^2/K$.

22.13. For the monatomic gas of exercise 22.5, find an explicit expression for
the function $Z(\beta, V)$. Evaluate its partial derivatives and show that they
equal $-U$ and $\beta p$ respectively.

22.14. Suppose that entropy S of a system is a function only of its energy U, in
accordance with the formula $S = N(U/U_0)^{1/2}$. Find expressions for the
heat capacities of this system as a function of temperature.

22.15. Consider a system that consists of three particles. Two of them are
indistinguishable bosons, while the third is distinguishable from the other
two. Available to each particle are three states, all of the same energy.

(a) What is the probability that all three particles occupy the same state?

(b) What is the probability that each particle occupies a different state?

22.16. For a classical magnetic dipole of moment $\mu$ in a magnetic field H, the
energy is $U = -\mu H\cos\theta$, where $\theta$ is the angle between the vectors.
By integrating over all possible orientations, evaluate the partition
function for this system.

22.17. Consider a system with three states whose energies are $-\epsilon$, 0, and $\epsilon$
respectively. Suppose the expected value of the energy is $-$(…)$\epsilon$. Determine …

… and one state of energy $\epsilon$. If the expected value of the energy is $-$(…)$\epsilon$, what
are the probabilities of the various states?

Consider a system with just two states, of energy $-\epsilon$ and $\epsilon$ respectively.
Write down its partition function and use it to determine the energy U and
entropy S as functions of temperature T.

integer k we can consider the 2k-form

$$\Omega^k = \Omega \wedge \Omega \wedge \cdots \wedge \Omega \quad (k \text{ times}).$$

For each point x let k be that non-negative integer such that $(\Omega^k)_x \neq 0$ but
$(\Omega^{k+1})_x = 0$. This integer k will, in general, vary with x. We define the rank of $\Omega$
at x to be the integer 2k. We will, in the main, be interested in forms of constant
rank where 2k is independent of x. To say that 2k = 0 means that $\Omega = 0$. The
form $\Omega = dx \wedge dy$ has rank 2 since $\Omega \neq 0$ while $\Omega^2 = 0$. The form

$$\Omega = dx \wedge dy + du \wedge dv$$

has rank 4 since

$$\Omega^2 = 2\,dx \wedge dy \wedge du \wedge dv \neq 0.$$













to say that $\omega$ has rank 2k if $(d\omega)^k$ is nowhere zero but

$$\omega \wedge (d\omega)^k = 0 \quad \text{identically.}$$

If we apply d to this equation we see that then $(d\omega)^{k+1}$ is identically zero, in other
words $d\omega$ has rank 2k.

Examples

The zero-form has rank zero.

$dx_0$ has rank 1 since $d(dx_0) = 0$ so k = 0.

$x_1dx_2$ has rank 2 in the region $x_1 \neq 0$ since $d(x_1dx_2) = dx_1 \wedge dx_2$ has k = 1 and
$x_1dx_2 \wedge dx_1 \wedge dx_2 = 0$.

$dx_0 + x_1dx_2$ has rank 3 since $d(dx_0 + x_1dx_2) = dx_1 \wedge dx_2$ has k = 1 while
$(dx_0 + x_1dx_2) \wedge dx_1 \wedge dx_2 = dx_0 \wedge dx_1 \wedge dx_2 \neq 0$.

In general it is clear that

$$dx_0 + x_1dx_2 + x_3dx_4 + \cdots + x_{2k-1}dx_{2k} \quad \text{has rank } 2k + 1$$

while

$$x_1dx_2 + x_3dx_4 + \cdots + x_{2k-1}dx_{2k} \quad \text{has rank } 2k$$

in the region where not all the $x_{2i-1}$ vanish.
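As a worked instance of the general claim, here is the case k = 2 written out. This is a sketch of a routine check using only the rules of exterior multiplication, not a passage from the text.

```latex
\begin{align*}
\omega &= x_1\,dx_2 + x_3\,dx_4, \\
d\omega &= dx_1\wedge dx_2 + dx_3\wedge dx_4, \\
(d\omega)^2 &= 2\,dx_1\wedge dx_2\wedge dx_3\wedge dx_4 \neq 0, \\
\omega\wedge(d\omega)^2 &= 2\,(x_1\,dx_2 + x_3\,dx_4)\wedge
   dx_1\wedge dx_2\wedge dx_3\wedge dx_4 = 0, \\
(dx_0 + \omega)\wedge(d\omega)^2 &= 2\,dx_0\wedge dx_1\wedge dx_2\wedge
   dx_3\wedge dx_4 \neq 0 .
\end{align*}
```

The fourth line vanishes because each term repeats either $dx_2$ or $dx_4$, so $\omega$ has rank 4, while the last line, together with $(d\omega)^3 = 0$, shows that $dx_0 + \omega$ has rank 5.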

We wish to prove, in this appendix, that these are the only examples: that if $\omega$ is
a linear differential form of constant rank 2k then we can always find local
coordinates $x_1, x_2, \ldots, x_{2k}$ so that

$$\omega = x_1dx_2 + x_3dx_4 + \cdots + x_{2k-1}dx_{2k},$$

while if the rank of $\omega$ is 2k + 1 then we can always find coordinates
$x_0, x_1, x_2, \ldots, x_{2k}$, so that

$$\omega = dx_0 + x_1dx_2 + x_3dx_4 + \cdots + x_{2k-1}dx_{2k}.$$

It was this fact that we used to prove Caratheodory’s theorem.

2. Reduction to Darboux’s theorem 

If we believe the above theorem, then we can always introduce coordinates so that

$$d\omega = dx_1 \wedge dx_2 + dx_3 \wedge dx_4 + \cdots + dx_{2k-1} \wedge dx_{2k}.$$

Darboux’s theorem, which we shall prove later on in this appendix, asserts that
if $\Omega$ is any closed two-form of constant rank 2k, then we can always introduce
coordinates so that

$$\Omega = dx_1 \wedge dx_2 + dx_3 \wedge dx_4 + \cdots + dx_{2k-1} \wedge dx_{2k}.$$

Let us assume Darboux’s theorem for the moment and derive the normal form
theorem of the preceding section. If the rank of $\omega$ is odd, there is no effort at all.
Indeed, $d\omega$ is a closed two-form of rank 2k, so we can write $d\omega$ as above. Then

$$d\omega = d(x_1dx_2 + x_3dx_4 + \cdots + x_{2k-1}dx_{2k})$$

or

$$d(\omega - x_1dx_2 - x_3dx_4 - \cdots - x_{2k-1}dx_{2k}) = 0,$$

so that

$$\omega - x_1dx_2 - x_3dx_4 - \cdots - x_{2k-1}dx_{2k} = dx_0$$

by the Poincaré lemma, where $x_0$ is some function. Thus

$$\omega = dx_0 + x_1dx_2 + x_3dx_4 + \cdots + x_{2k-1}dx_{2k}.$$

But

$$\omega \wedge (d\omega)^k = k!\,dx_0 \wedge dx_1 \wedge dx_2 \wedge \cdots \wedge dx_{2k-1} \wedge dx_{2k}$$

is not equal to zero by assumption. Hence $dx_0$ is independent of the remaining
$dx_i$ and so we can use $x_0, x_1, \ldots, x_{2k}$ as part of a coordinate system. This completes
the proof of the normal form theorem for linear differential forms of odd rank,
assuming Darboux’s theorem about closed two-forms. For even rank we have to
work a little harder. What we shall prove is that if $\omega$ has (constant) rank 2k, then

$$\omega = f\sigma$$

where $\sigma$ has rank 2k − 1. If we then apply the normal form theorem for forms of
odd rank, we can write

$$\sigma = dx_0 + w_1dx_2 + \cdots + w_{2k-3}dx_{2k-2}$$

where $x_0, w_1, x_2, w_3$ etc. are coordinates. Now

$$\omega = f\,dx_0 + fw_1dx_2 + \cdots = x_1dx_2 + x_3dx_4 + \cdots + x_{2k-3}dx_{2k-2} + x_{2k-1}dx_{2k}$$

if we set

$$x_1 = fw_1,\ x_3 = fw_3,\ \ldots,\ x_{2k-3} = fw_{2k-3},\ x_{2k-1} = f \text{ and } x_{2k} = x_0.$$

These functions do form part of a coordinate system, since
$(d\omega)^k \neq 0$, which says that the exterior product of all the dx’s does not vanish, so
that the dx’s are linearly independent at all points. So we must prove that $\omega = f\sigma$
where $\sigma$ has rank 2k − 1 and f is positive. For this purpose, we make use of


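Darboux’s theorem (still to be proved), which allows us to write

$$d\omega = dy_1 \wedge dy_2 + \cdots + dy_{2k-1} \wedge dy_{2k}$$

where $y_1, \ldots, y_n$ are suitable local coordinates. If we use these coordinates and write

$$\omega = a_1dy_1 + a_2dy_2 + \cdots + a_ndy_n$$

then, since $\omega \wedge (d\omega)^k = 0$, all the $a_i = 0$ for i > 2k. The form of $d\omega$ implies that the
remaining $a_i$ cannot depend on the coordinates $(y_{2k+1}, \ldots, y_n)$, for we
would then get non-zero coefficients of $dy_i \wedge dy_j$ with $i \leq 2k$ and j > 2k in the
expression for $d\omega$. Thus $\omega$ is really a form on
$\mathbb{R}^{2k}$. All other variables are irrelevant.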

Now

$$\omega \wedge (d\omega)^{k-1}$$

is a nonvanishing form of degree 2k − 1 on $\mathbb{R}^{2k}$. Hence, at every point in its domain
of definition it defines a line, namely the one dimensional space of all solutions of

$$i(\xi)[\omega \wedge (d\omega)^{k-1}] = 0.$$

Thus $\omega \wedge (d\omega)^{k-1}$ defines a ‘line element field’ - in other words a system of ordinary
differential equations. Let us choose a (2k − 1)-dimensional surface and use points
on this surface as initial conditions. Then each point near the hypersurface lies on
a unique solution curve, and so corresponds to a definite time, t (the parameter
along the curve), and a definite point (initial condition) on the hypersurface.

If we introduce coordinates $z_1, \ldots, z_{2k-1}$ on the hypersurface then $t, z_1, \ldots, z_{2k-1}$
are local coordinates on $\mathbb{R}^{2k}$. In terms of these coordinates $\xi = \partial/\partial t$. In other words,
the forms

$$\omega \wedge (d\omega)^{k-1} \quad \text{and} \quad dz_1 \wedge dz_2 \wedge \cdots \wedge dz_{2k-1}$$

define the same line element field. Thus

$$\omega \wedge (d\omega)^{k-1} = g\,dz_1 \wedge \cdots \wedge dz_{2k-1}$$

where g is a nowhere vanishing function. Replacing $z_1$ by $-z_1$ if necessary allows
us to assume that g is positive. Take f to satisfy

$$f^k = g.$$

If we then define $\sigma$ by $\omega = f\sigma$ then

$$d\omega = df \wedge \sigma + f\,d\sigma$$

so

$$\omega \wedge d\omega = f^2\sigma \wedge d\sigma$$

and, more generally,

$$\omega \wedge (d\omega)^{k-1} = f^k\sigma \wedge (d\sigma)^{k-1} = g\,dz_1 \wedge dz_2 \wedge \cdots \wedge dz_{2k-1}.$$

Thus

$$\sigma \wedge (d\sigma)^{k-1} = dz_1 \wedge dz_2 \wedge \cdots \wedge dz_{2k-1}.$$

Taking d of both sides of this equation shows that

$$(d\sigma)^k = 0.$$

In other words, $\sigma$ is a form of rank 2k − 1. This completes the normal form theorem,
assuming Darboux’s theorem.

3. Proof of Darboux’s theorem 

Let us make some preliminary reductions for the proof of Darboux’s theorem.
We are starting with a closed two-form $\Omega$ of rank 2k on an n-dimensional space.
Suppose that n > 2k. Then we can choose a vector field $\xi$ so that $i(\xi)\Omega$ vanishes
identically. As before, we can then introduce coordinates (by solving a system of
ordinary differential equations) $y_1, \ldots, y_{n-1}, t$ so that $\xi = \partial/\partial t$. The fact that $i(\xi)\Omega$
vanishes identically then means that $\Omega$ does not involve dt in its local expression
in these coordinates, i.e. that

$$\Omega = \sum a_{ij}\,dy_i \wedge dy_j.$$

But then

$$d\Omega = \sum (\partial a_{ij}/\partial t)\,dt \wedge dy_i \wedge dy_j + (\text{terms not involving } dt)$$

and so the equation $d\Omega = 0$ implies that all the $\partial a_{ij}/\partial t = 0$. Thus the $a_{ij}$ depend
only on the y’s. In other words, $\Omega$ is really a two-form defined on the (n − 1)-dimensional
space of the y’s. If n − 1 > 2k we can go through the same procedure
to cut down an additional dimension. In other words, in the proof of Darboux’s
theorem we may assume that 2k = n.

We claim that this condition implies that the form $\Omega$ is non-singular at every
point in the sense that if

$$i(\xi_p)\Omega_p = 0$$

for a tangent vector $\xi_p$ at some point p, then $\xi_p = 0$. Indeed, the above equation
clearly implies that

$$i(\xi_p)[\Omega \wedge \Omega \wedge \cdots \wedge \Omega]_p = 0 \quad (k \text{ times}).$$

But $[\Omega \wedge \Omega \wedge \cdots \wedge \Omega]_p$ is an n-form on an n-dimensional space, and the space of
n-forms is one-dimensional. Since, by assumption, the n-form $[\Omega \wedge \Omega \wedge \cdots \wedge \Omega]_p \neq 0$,
we must have $\xi_p = 0$. (In fact, it will follow from the ensuing discussion that, on an
n-dimensional space, the condition that $\Omega$ be non-singular is equivalent to the
condition that it be of rank n.) Another way of saying that $\Omega$ is non-singular is
simply to say that the n by n matrix

$$(a_{ij}(y))$$

that occurs in the expression

$$\Omega = \sum a_{ij}\,dy_i \wedge dy_j$$

is a non-singular matrix at all y.

We will break the proof of Darboux’s theorem into two parts:

(a) a differential geometric part which asserts that it is always possible to introduce
a local change of coordinates so that in the new coordinates the matrix

$$(a_{ij}(y))$$

is constant, i.e. that all the $a_{ij}(y)$ are independent of y;

(b) an algebraic part which asserts that we can make a further linear change of
coordinates so that the constant matrix $(a_{ij})$ takes the form

$$\begin{pmatrix} 0 & -1 & & & \\ 1 & 0 & & & \\ & & \ddots & & \\ & & & 0 & -1 \\ & & & 1 & 0 \end{pmatrix}.$$

This is a theorem about antisymmetric bilinear forms on a vector space which we shall
prove by induction on the dimension.

Suppose that V is a two-dimensional vector space and $\Omega$ is a nondegenerate
bilinear form on V. Let u be any non-zero vector in V. Then the linear form

$$\Omega(u, \cdot)$$

is a non-zero linear form, by the nondegeneracy of $\Omega$. Hence we can find a vector
v such that

$$\Omega(u, v) = 1.$$

at time t is $\xi_t(f_t(x))$. Furthermore $f_t(0) = 0$ for all t. Now by the fundamental formula

since

Thus $f_t$ is our desired diffeomorphism. This completes the proof of Darboux’s
theorem, and hence of Caratheodory’s theorem.






Further reading 


Once again, the list of books that we give at the end of this section is not meant as 



abstract) mathematical proofs. Similarly the Feynman Lectures once again provide 
a general reference for the physics.

The main algebraic content of volume II is the passage from linear algebra to 
multilinear algebra. The book by Greub is a clear and careful introduction to this 
subject. In particular such subjects as the tensor algebra, the symmetric algebra,
and the Clifford algebras, which we touch briefly, are given a leisurely and detailed 
treatment. 

Chapters 12 and 13 treat electrical network theory. The classical old school circuit
theory text is Guillemin. The book by Lorrain, Corson, and Lorrain discusses the
delightful introduction to the probabilistic aspects of network theory and Kelly
for a more advanced treatment. For the subject of graph theory in its own right

In Chapter 14 we barely give the definitions of homology and cohomology, so 
as to introduce some language. We don’t really treat the subject at all. For an 
introductory treatment at the level of this book see Giblin. For an introduction 






finishing this text, you should be able to read Bott and Tu which is a brilliant and 
elegant treatment of many of the deep topics in algebraic topology from the
precise treatment of differential topology we recommend Guillemin and Pollack

Chapters 16-19 treat electricity and magnetism. There are many excellent books
on this subject, and we have listed a selection. The one most accessible at the level












of this book is Purcell which is elegant and extremely clearly written and which 




et al. discusses electromagnetism using differential forms.

Two series of books which develop mathematical physics at a level substantially
The first is mathematical and the second is a physics text series. Both are rough




The classical text on complex analysis is that by Ahlfors which can be read with 
profit after finishing our Chapter 20. Another good book is Polya and Latta. Of 




to complex analysis is Olver which comes in a big and a small version. The book
operators and the geometric theory of asymptotics. It makes rather heavy
mathematical demands on the reader. 


There are hundreds of books on thermodynamics and on statistical mechanics.
For an iconoclastic treatment of the history of the subject see Truesdell. By the
way, he violently opposes the point of view we adopt here towards thermodynamics.
The book by Born is a classic semipopular account of related topics. Our treatment

thick book. Kittel is thinner and more elementary in its outlook than our treatment.
A standard text is Reif and a good treatment of various topics from a more
mathematical angle is Thompson.

We never got to quantum mechanics in this course except for a brief mention 
of quantum logic at the very end. The best introduction to the subject from the 
mathematical point of view is still Mackey. A detailed scholarly treatment of 
quantum logic at an advanced level is Varadarajan. A balanced well thought out 
treatment with many applications is Bohm. Another user-friendly introduction is 
Sudbery. 


Ahlfors, L. Complex Analysis, McGraw-Hill, 1979 


Born, M. The Natural Philosophy of Cause and Chance, Oxford Univ. Press, 1964 
Bott, R. and Tu, L. Differential Forms in Algebraic Topology, Springer, 1982

This textbook has been developed from a course taught at Harvard over the last
decade. The course covers principally the theory and physical applications of linear
algebra, and of the calculus of several variables, particularly the exterior
calculus.

The authors adopt the ‘spiral method’ of teaching, covering the same topic several
times at increasing levels of sophistication and range of application. Thus the
student develops a deep intuitive understanding of the subject as a whole and an
appreciation of the natural progression of ideas.

This, the second volume, opens with an introduction to algebraic topology,
introduced by the analysis of electrical networks, or, mathematically speaking,
the topology of one-dimensional complexes.

Chapters 15-18 develop the exterior differential calculus as a continuous
version of the discrete theory of complexes. Facts of the exterior calculus
are presented: exterior algebra, k-forms, pullback, exterior derivative and Stokes’
theorem.

Chapter 16 presents another physical theory, electrostatics. The authors argue
that the dielectric properties of the vacuum determine Euclidean geometry in
three-dimensional space. The basic facts of potential theory are presented.

Chapters 17 and 18 continue and conclude the study of the exterior
differential calculus, developing the notions of vector fields and flows, interior
products and Lie derivatives, and applying them to magnetostatics. The star
operator is discussed in a general context.

Chapter 19 can be thought of as the culmination of the course. It applies the
results of the preceding chapters to the study of Maxwell’s equations and the
associated wave equations.

The last two chapters cover complex analysis and elementary asymptotics,
and the book ends with a sophisticated treatment of thermodynamics.

‘... this book will serve as a fundamental text not only for students in physics, but
also for students in mathematics interested in the most evident applications of
mathematical definitions, results and theories.’

Padiatre and Padologie

‘... there is to my knowledge no comparable book, and it is hard to
imagine a more inspiring one.’

Times Higher Education Supplement

‘Not only is the mathematics clean, elegant, and modern, but the presentation
is humane, especially for a mathematics text. Examples are provided before
generalisation, and motivation and applications are kept firmly in view ... This
is first rate!’